KINASES AND PHOSPHATASES 



TECHNICAL FIELD 

The invention relates to novel nucleic acids, kinases and phosphatases encoded by these
nucleic acids, and to the use of these nucleic acids and proteins in the diagnosis, treatment, and
prevention of cardiovascular diseases, immune system disorders, neurological disorders, disorders
affecting growth and development, lipid disorders, cell proliferative disorders, and cancers. The
invention also relates to the assessment of the effects of exogenous compounds on the expression of
nucleic acids and kinases and phosphatases.

BACKGROUND OF THE INVENTION 
Reversible protein phosphorylation is the ubiquitous strategy used to control many of the
intracellular events in eukaryotic cells. It is estimated that more than ten percent of proteins active in
a typical mammalian cell are phosphorylated. Kinases catalyze the transfer of high-energy phosphate
groups from adenosine triphosphate (ATP) to target proteins on the hydroxyamino acid residues
serine, threonine, or tyrosine. Phosphatases, in contrast, remove these phosphate groups.
Extracellular signals including hormones, neurotransmitters, and growth and differentiation factors
can activate kinases, which can occur as cell surface receptors or as the activator of the final effector
protein, as well as other locations along the signal transduction pathway. Cascades of kinases occur,
as well as kinases sensitive to second messenger molecules. This system allows for the amplification
of weak signals (low abundance growth factor molecules, for example), as well as the synthesis of
many weak signals into an all-or-nothing response. Phosphatases, then, are essential in determining
the extent of phosphorylation in the cell and, together with kinases, regulate key cellular processes
such as metabolic enzyme activity, proliferation, cell growth and differentiation, cell adhesion, and
cell cycle progression.
KINASES 

Kinases comprise the largest known enzyme superfamily and vary widely in their target
molecules. Kinases catalyze the transfer of high energy phosphate groups from a phosphate donor to
a phosphate acceptor. Nucleotides usually serve as the phosphate donor in these reactions, with most
kinases utilizing adenosine triphosphate (ATP). The phosphate acceptor can be any of a variety of
molecules, including nucleosides, nucleotides, lipids, carbohydrates, and proteins. Proteins are
phosphorylated on hydroxyamino acids. Addition of a phosphate group alters the local charge on the
acceptor molecule, causing internal conformational changes and potentially influencing
intermolecular contacts. Reversible protein phosphorylation is the primary method for regulating
protein activity in eukaryotic cells. In general, proteins are activated by phosphorylation in response
to extracellular signals such as hormones, neurotransmitters, and growth and differentiation factors.
The activated proteins initiate the cell's intracellular response by way of intracellular signaling
pathways and second messenger molecules such as cyclic nucleotides, calcium-calmodulin, inositol,
and various mitogens, that regulate protein phosphorylation.

Kinases are involved in all aspects of a cell's function, from basic metabolic processes, such
as glycolysis, to cell-cycle regulation, differentiation, and communication with the extracellular
environment through signal transduction cascades. Inappropriate phosphorylation of proteins in cells
has been linked to changes in cell cycle progression and cell differentiation. Changes in the cell cycle
have been linked to induction of apoptosis or cancer. Changes in cell differentiation have been linked
to diseases and disorders of the reproductive system, immune system, and skeletal muscle.

There are two classes of protein kinases. One class, protein tyrosine kinases (PTKs),
phosphorylates tyrosine residues, and the other class, protein serine/threonine kinases (STKs),
phosphorylates serine and threonine residues. Some PTKs and STKs possess structural
characteristics of both families and have dual specificity for both tyrosine and serine/threonine
residues. Almost all kinases contain a conserved 250-300 amino acid catalytic domain containing
specific residues and sequence motifs characteristic of the kinase family. The protein kinase catalytic
domain can be further divided into 11 subdomains. N-terminal subdomains I-IV fold into a two-lobed
structure which binds and orients the ATP donor molecule, and subdomain V spans the two lobes.
C-terminal subdomains VI-XI bind the protein substrate and transfer the gamma phosphate from ATP to
the hydroxyl group of a tyrosine, serine, or threonine residue. Each of the 11 subdomains contains
specific catalytic residues or amino acid motifs characteristic of that subdomain. For example,
subdomain I contains an 8-amino acid glycine-rich ATP binding consensus motif, subdomain II
contains a critical lysine residue required for maximal catalytic activity, and subdomains VI through
IX comprise the highly conserved catalytic core. PTKs and STKs also contain distinct sequence
motifs in subdomains VI and VIII which may confer hydroxyamino acid specificity.

In addition, kinases may also be classified by additional amino acid sequences, generally
between 5 and 100 residues, which either flank or occur within the kinase domain. These additional
amino acid sequences regulate kinase activity and determine substrate specificity. (Reviewed in
Hardie, G. and S. Hanks (1995) The Protein Kinase Facts Book, Vol I, pp. 17-20, Academic Press,
San Diego CA.) In particular, two protein kinase signature sequences have been identified in the
kinase domain, the first containing an active site lysine residue involved in ATP binding, and the
second containing an aspartate residue important for catalytic activity. If a protein analyzed includes
the two protein kinase signatures, the probability of that protein being a protein kinase is close to
100% (PROSITE: PDOC00100, November 1995).
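
The two-signature screen described above can be illustrated with a minimal Python sketch. The two regular expressions are deliberately simplified stand-ins (a glycine-rich loop followed by a lysine, and a short aspartate-containing loop); they are assumptions made for illustration and are not the PROSITE PDOC00100 patterns. The function name and the toy sequence are likewise hypothetical.

```python
import re

# Simplified, illustrative stand-ins for the two signatures. These are
# assumptions for this sketch, NOT the actual PROSITE PDOC00100 patterns.
ATP_BINDING_LIKE = re.compile(r"G.G..G.{5,30}K")  # glycine-rich loop, then a lysine
CATALYTIC_LIKE = re.compile(r"[HY].D[LIVMFY]K")   # aspartate-containing loop


def has_both_signatures(seq: str) -> bool:
    """Return True if the ATP-binding-like motif is followed by the catalytic-like motif."""
    first = ATP_BINDING_LIKE.search(seq)
    if first is None:
        return False
    # Only look for the second motif downstream of the first one.
    return CATALYTIC_LIKE.search(seq, first.end()) is not None


# Toy sequence invented for illustration only; it is not a real protein.
toy = "MAAGEGAFGKVYLAREKQSKFILALKVLFKSQIEKEGVEHQLRREIEIQSHLRHPNILRLYHRDIKPENLLL"
print(has_both_signatures(toy))
```

A real screen would use the curated PROSITE patterns or profile-based methods; the sketch only shows the logic of requiring both signatures in the expected order.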

Protein Tyrosine Kinases

Protein tyrosine kinases (PTKs) may be classified as either transmembrane, receptor PTKs or
nontransmembrane, nonreceptor PTK proteins. Transmembrane tyrosine kinases function as
receptors for most growth factors. Growth factors bind to the receptor tyrosine kinase (RTK), which
causes the receptor to phosphorylate itself (autophosphorylation) and specific intracellular second
messenger proteins. Growth factors (GF) that associate with receptor PTKs include epidermal GF,
platelet-derived GF, fibroblast GF, hepatocyte GF, insulin and insulin-like GFs, nerve GF, vascular
endothelial GF, and macrophage colony stimulating factor.

Nontransmembrane, nonreceptor PTKs lack transmembrane regions and, instead, form
signaling complexes with the cytosolic domains of plasma membrane receptors. Receptors that
function through non-receptor PTKs include those for cytokines and hormones (growth hormone and
prolactin), and antigen-specific receptors on T and B lymphocytes.

Many PTKs were first identified as oncogene products in cancer cells in which PTK
activation was no longer subject to normal cellular controls. In fact, about one third of the known
oncogenes encode PTKs. Furthermore, cellular transformation (oncogenesis) is often accompanied
by increased tyrosine phosphorylation activity (Charbonneau, H. and N.K. Tonks (1992) Annu. Rev.
Cell Biol. 8:463-493). Regulation of PTK activity may therefore be an important strategy in
controlling some types of cancer.
Protein Serine/Threonine Kinases 

Protein serine/threonine kinases (STKs) are nontransmembrane proteins. A subclass of STKs
are known as ERKs (extracellular signal regulated kinases) or MAPs (mitogen-activated protein
kinases) and are activated after cell stimulation by a variety of hormones and growth factors. Cell
stimulation induces a signaling cascade leading to phosphorylation of MEK (MAP/ERK kinase)
which, in turn, activates ERK via serine and threonine phosphorylation. A varied number of proteins
represent the downstream effectors for the active ERK and implicate it in the control of cell
proliferation and differentiation, as well as regulation of the cytoskeleton. Activation of ERK is
normally transient, and cells possess dual specificity phosphatases that are responsible for its
down-regulation. Also, numerous studies have shown that elevated ERK activity is associated with some
cancers. Other STKs include the second messenger dependent protein kinases such as the
cyclic-AMP dependent protein kinases (PKA), calcium-calmodulin (CaM) dependent protein kinases,
and the mitogen-activated protein kinases (MAP); the cyclin-dependent protein kinases; checkpoint
and cell cycle kinases; Numb-associated kinase (Nak); human Fused (hFu); proliferation-related
kinases; 5'-AMP-activated protein kinases; and kinases involved in apoptosis.

One member of the ERK family of MAP kinases, ERK7, is a novel 61-kDa protein that has
motif similarities to ERK1 and ERK2, but is not activated by extracellular stimuli as are ERK1 and
ERK2 nor by the common activators, c-Jun N-terminal kinase (JNK) and p38 kinase. ERK7 regulates
its nuclear localization and inhibition of growth through its C-terminal tail, not through the kinase
domain as is typical with other MAP kinases (Abe, M.K. (1999) Mol. Cell. Biol. 19:1301-1312).

The second messenger dependent protein kinases primarily mediate the effects of second
messengers such as cyclic AMP (cAMP), cyclic GMP, inositol triphosphate, phosphatidylinositol
3,4,5-triphosphate, cyclic ADP ribose, arachidonic acid, diacylglycerol and calcium-calmodulin. The
PKAs are involved in mediating hormone-induced cellular responses and are activated by cAMP
produced within the cell in response to hormone stimulation. cAMP is an intracellular mediator of
hormone action in all animal cells that have been studied. Hormone-induced cellular responses
include thyroid hormone secretion, cortisol secretion, progesterone secretion, glycogen breakdown,
bone resorption, and regulation of heart rate and force of heart muscle contraction. PKA is found in
all animal cells and is thought to account for the effects of cAMP in most of these cells. Altered PKA
expression is implicated in a variety of disorders and diseases including cancer, thyroid disorders,
diabetes, atherosclerosis, and cardiovascular disease (Isselbacher, K.J. et al. (1994) Harrison's
Principles of Internal Medicine, McGraw-Hill, New York NY, pp. 416-431, 1887).

The casein kinase I (CKI) gene family is another subfamily of serine/threonine protein
kinases. This continuously expanding group of kinases has been implicated in the regulation of
numerous cytoplasmic and nuclear processes, including cell metabolism and DNA replication and
repair. CKI enzymes are present in the membranes, nucleus, cytoplasm and cytoskeleton of
eukaryotic cells, and on the mitotic spindles of mammalian cells (Fish, K.J. et al. (1995) J. Biol.
Chem. 270:14875-14883).

The CKI family members all have a short amino-terminal domain of 9-76 amino acids, a
highly conserved kinase domain of 284 amino acids, and a variable carboxyl-terminal domain that
ranges from 24 to over 200 amino acids in length (Cegielska, A. et al. (1998) J. Biol. Chem.
273:1357-1364). The CKI family is comprised of highly related proteins, as seen by the identification
of isoforms of casein kinase I from a variety of sources. There are at least five mammalian isoforms,
α, β, γ, δ, and ε. Fish et al. identified CKI-epsilon from a human placenta cDNA library. It is a basic
protein of 416 amino acids and is closest to CKI-delta. Through recombinant expression, it was
determined to phosphorylate known CKI substrates and was inhibited by the CKI-specific inhibitor
CKI-7. The human gene for CKI-epsilon was able to rescue yeast with a slow-growth phenotype
caused by deletion of the yeast CKI locus, HRR250 (Fish et al., supra).

The mammalian circadian mutation tau was found to be a semidominant autosomal allele of
CKI-epsilon that markedly shortens period length of circadian rhythms in Syrian hamsters. The tau
locus is encoded by casein kinase I-epsilon, which is also a homolog of the Drosophila circadian gene
double-time. Studies of both the wildtype and tau mutant CKI-epsilon enzyme indicated that the
mutant enzyme has a noticeable reduction in the maximum velocity and autophosphorylation state.
Further, in vitro, CKI-epsilon is able to interact with mammalian PERIOD proteins, while the mutant
enzyme is deficient in its ability to phosphorylate PERIOD. Lowrey et al. have proposed that
CKI-epsilon plays a major role in delaying the negative feedback signal within the
transcription-translation-based autoregulatory loop that composes the core of the circadian mechanism.
Therefore the CKI-epsilon enzyme is an ideal target for pharmaceutical compounds influencing circadian
rhythms, jet-lag and sleep, in addition to other physiologic and metabolic processes under circadian
regulation (Lowrey, P.L. et al. (2000) Science 288:483-491).

Homeodomain-interacting protein kinases (HIPKs) are serine/threonine kinases and novel
members of the DYRK kinase subfamily (Hofmann, T.G. et al. (2000) Biochimie 82:1123-1127).
HIPKs contain a conserved protein kinase domain separated from a domain that interacts with
homeoproteins. HIPKs are nuclear kinases, and HIPK2 is highly expressed in neuronal tissue (Kim,
Y.H. et al. (1998) J. Biol. Chem. 273:25875-25879; Wang, Y. et al. (2001) Biochim. Biophys. Acta
1518:168-172). HIPKs act as corepressors for homeodomain transcription factors. This corepressor
activity is seen in posttranslational modifications such as ubiquitination and phosphorylation, each of
which is important in the regulation of cellular protein function (Kim, Y.H. et al. (1999) Proc. Natl.
Acad. Sci. USA 96:12350-12355).

The human h-warts protein, a homolog of Drosophila warts tumor suppressor gene, maps to
chromosome 6q24-25.1. It has a serine/threonine kinase domain and is localized to centrosomes in
interphase cells. It is involved in mitosis and functions as a component of the mitotic apparatus
(Nishiyama, Y. et al. (1999) FEBS Lett. 459:159-165).
Calcium-Calmodulin Dependent Protein Kinases

Calcium-calmodulin dependent (CaM) kinases are involved in regulation of smooth muscle
contraction, glycogen breakdown (phosphorylase kinase), and neurotransmission (CaM kinase I and
CaM kinase II). CaM dependent protein kinases are activated by calmodulin, an intracellular calcium
receptor, in response to the concentration of free calcium in the cell. Many CaM kinases are also
activated by phosphorylation. Some CaM kinases are also activated by autophosphorylation or by
other regulatory kinases. CaM kinase I phosphorylates a variety of substrates including the
neurotransmitter-related proteins synapsin I and II, the gene transcription regulator, CREB, and the
cystic fibrosis conductance regulator protein, CFTR (Haribabu, B. et al. (1995) EMBO J.
14:3679-3686). CaM kinase II also phosphorylates synapsin at different sites and controls the synthesis of
catecholamines in the brain through phosphorylation and activation of tyrosine hydroxylase. CaM
kinase II controls the synthesis of catecholamines, and serotonin, through phosphorylation/activation
of tyrosine hydroxylase and tryptophan hydroxylase, respectively (Fujisawa, H. (1990) BioEssays
12:27-29). The mRNA encoding a calmodulin-binding protein kinase-like protein was found to be
enriched in mammalian forebrain. This protein is associated with vesicles in both axons and
dendrites and accumulates largely postnatally. The amino acid sequence of this protein is similar to
CaM-dependent STKs, and the protein binds calmodulin in the presence of calcium (Godbout, M. et
al. (1994) J. Neurosci. 14:1-13).
Mitogen-Activated Protein Kinases

The mitogen-activated protein kinases (MAP), which mediate signal transduction from the
cell surface to the nucleus via phosphorylation cascades, are another STK family that regulates
intracellular signaling pathways. Several subgroups have been identified, and each manifests
different substrate specificities and responds to distinct extracellular stimuli (Egan, S.E. and R.A.
Weinberg (1993) Nature 365:781-783). There are three kinase modules comprising the MAP kinase
cascade: MAPK (MAP), MAPK kinase (MAP2K, MAPKK, or MKK), and MKK kinase (MAP3K,
MAPKKK, or MEKK) (Wang, X.S. et al. (1998) Biochem. Biophys. Res. Commun. 253:33-37). The
extracellular-regulated kinase (ERK) pathway is activated by growth factors and mitogens, for
example, epidermal growth factor (EGF), ultraviolet light, hyperosmolar medium, heat shock, or
endotoxic lipopolysaccharide (LPS). The closely related though distinct parallel pathways, the c-Jun
N-terminal kinase (JNK), or stress-activated kinase (SAPK) pathway, and the p38 kinase pathway are
activated by stress stimuli and proinflammatory cytokines such as tumor necrosis factor (TNF) and
interleukin-1 (IL-1). Altered MAP kinase expression is implicated in a variety of disease conditions
including cancer, inflammation, immune disorders, and disorders affecting growth and development.
MAP kinase signaling pathways are present in mammalian cells as well as in yeast.
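
The three-tier organization of the MAP kinase cascade described above can be sketched as a simple ordered relay. This is a minimal illustration only: the `CASCADE` tuple and the `propagate` function are hypothetical names introduced here, and real pathways involve branching, scaffolding, and feedback that the sketch does not model.

```python
# Three kinase modules of the MAP kinase cascade, ordered upstream to downstream.
CASCADE = ("MAP3K", "MAP2K", "MAPK")


def propagate(stimulus_present: bool, inhibited: frozenset = frozenset()) -> dict:
    """Relay activation down the cascade; an inhibited tier blocks every tier below it."""
    state = {}
    active = stimulus_present
    for tier in CASCADE:
        # A tier is active only if the tier above it was active and it is not inhibited.
        active = active and tier not in inhibited
        state[tier] = active
    return state


print(propagate(True))
print(propagate(True, inhibited=frozenset({"MAP2K"})))
```

In the second call, blocking the middle tier leaves MAP3K active but prevents activation of MAPK, which is the sense in which each module depends on the one upstream of it.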

Cyclin-Dependent Protein Kinases

The cyclin-dependent protein kinases (CDKs) are STKs that control the progression of cells
through the cell cycle. The entry and exit of a cell from mitosis are regulated by the synthesis and
destruction of a family of activating proteins called cyclins. Cyclins are small regulatory proteins that
bind to and activate CDKs, which then phosphorylate and activate selected proteins involved in the
mitotic process. CDKs are unique in that they require multiple inputs to become activated. In
addition to cyclin binding, CDK activation requires the phosphorylation of a specific threonine
residue and the dephosphorylation of a specific tyrosine residue on the CDK.
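
The multiple-input requirement for CDK activation described above amounts to a logical AND of three conditions. The following minimal Python sketch makes that explicit; the `CdkState` class and its field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdkState:
    """Hypothetical snapshot of the three inputs named in the text."""
    cyclin_bound: bool
    threonine_phosphorylated: bool  # the specific activating threonine
    tyrosine_phosphorylated: bool   # the specific tyrosine that must be dephosphorylated


def is_active(cdk: CdkState) -> bool:
    """All three conditions must hold at once for the kinase to be active."""
    return (
        cdk.cyclin_bound
        and cdk.threonine_phosphorylated
        and not cdk.tyrosine_phosphorylated
    )


print(is_active(CdkState(True, True, False)))  # all three inputs satisfied
print(is_active(CdkState(True, True, True)))   # tyrosine still phosphorylated
```

Modeling the state as an immutable record keeps each input explicit, so a missing condition (for example, an undephosphorylated tyrosine) is easy to identify.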

Another family of STKs associated with the cell cycle are the NIMA (never in mitosis)-related
kinases (Neks). Both CDKs and Neks are involved in duplication, maturation, and separation
of the microtubule organizing center, the centrosome, in animal cells (Fry, A.M. et al. (1998) EMBO
J. 17:470-481).

Checkpoint and Cell Cycle Kinases

In the process of cell division, the order and timing of cell cycle transitions are under control
of cell cycle checkpoints, which ensure that critical events such as DNA replication and chromosome
segregation are carried out with precision. If DNA is damaged, e.g. by radiation, a checkpoint
pathway is activated that arrests the cell cycle to provide time for repair. If the damage is extensive,
apoptosis is induced. In the absence of such checkpoints, the damaged DNA is inherited by aberrant
cells which may cause proliferative disorders such as cancer. Protein kinases play an important role
in this process. For example, a specific kinase, checkpoint kinase 1 (Chk1), has been identified in
yeast and mammals, and is activated by DNA damage in yeast. Activation of Chk1 leads to the arrest
of the cell at the G2/M transition (Sanchez, Y. et al. (1997) Science 277:1497-1501). Specifically,
Chk1 phosphorylates the cell division cycle phosphatase CDC25, inhibiting its normal function which
is to dephosphorylate and activate the cyclin-dependent kinase Cdc2. Cdc2 activation controls the
entry of cells into mitosis (Peng, C.-Y. et al. (1997) Science 277:1501-1505). Thus, activation of
Chk1 prevents the damaged cell from entering mitosis. A deficiency in a checkpoint kinase, such as
Chk1, may also contribute to cancer by failure to arrest cells with damaged DNA at other checkpoints
such as G2/M.

Proliferation-Related Kinases

Proliferation-related kinase is a serum/cytokine inducible STK that is involved in regulation
of the cell cycle and cell proliferation in human megakaryocytic cells (Li, B. et al. (1996) J. Biol.
Chem. 271:19402-19408). Proliferation-related kinase is related to the polo (derived from
Drosophila polo gene) family of STKs implicated in cell division. Proliferation-related kinase is
downregulated in lung tumor tissue and may be a proto-oncogene whose deregulated expression in
normal tissue leads to oncogenic transformation.

5'-AMP-Activated Protein Kinase

A ligand-activated STK protein kinase is 5'-AMP-activated protein kinase (AMPK) (Gao, G.
et al. (1996) J. Biol. Chem. 271:8675-8681). Mammalian AMPK is a regulator of fatty acid and sterol
synthesis through phosphorylation of the enzymes acetyl-CoA carboxylase and
hydroxymethylglutaryl-CoA reductase and mediates responses of these pathways to cellular stresses
such as heat shock and depletion of glucose and ATP. AMPK is a heterotrimeric complex comprised
of a catalytic alpha subunit and two non-catalytic beta and gamma subunits that are believed to
regulate the activity of the alpha subunit. Subunits of AMPK have a much wider distribution in
non-lipogenic tissues such as brain, heart, spleen, and lung than expected. This distribution suggests
that its role may extend beyond regulation of lipid metabolism alone.

Kinases in Apoptosis

Apoptosis is a highly regulated signaling pathway leading to cell death that plays a crucial
role in tissue development and homeostasis. Deregulation of this process is associated with the
pathogenesis of a number of diseases including autoimmune diseases, neurodegenerative disorders,
and cancer. Various STKs play key roles in this process. ZIP kinase is an STK containing a
C-terminal leucine zipper domain in addition to its N-terminal protein kinase domain. This
C-terminal domain appears to mediate homodimerization and activation of the kinase as well as
interactions with transcription factors such as activating transcription factor, ATF4, a member of the
cyclic-AMP responsive element binding protein (ATF/CREB) family of transcriptional factors
(Sanjo, H. et al. (1998) J. Biol. Chem. 273:29066-29071). DRAK1 and DRAK2 are STKs that share
homology with the death-associated protein kinases (DAP kinases), known to function in interferon-γ
induced apoptosis (Sanjo et al., supra). Like ZIP kinase, DAP kinases contain a C-terminal
protein-protein interaction domain, in the form of ankyrin repeats, in addition to the N-terminal
kinase domain. ZIP, DAP, and DRAK kinases induce morphological changes associated with
apoptosis when transfected into NIH3T3 cells (Sanjo et al., supra). However, deletion of either the
N-terminal kinase catalytic domain or the C-terminal domain of these proteins abolishes apoptosis
activity, indicating that in addition to the kinase activity, activity in the C-terminal domain is also
necessary for apoptosis, possibly as an interacting domain with a regulator or a specific substrate.

RICK is another STK recently identified as mediating a specific apoptotic pathway involving
the death receptor, CD95 (Inohara, N. et al. (1998) J. Biol. Chem. 273:12296-12300). CD95 is a
member of the tumor necrosis factor receptor superfamily and plays a critical role in the regulation
and homeostasis of the immune system (Nagata, S. (1997) Cell 88:355-365). The CD95 receptor
signaling pathway involves recruitment of various intracellular molecules to a receptor complex
following ligand binding. This process includes recruitment of the cysteine protease caspase-8
which, in turn, activates a caspase cascade leading to cell death. RICK is composed of an N-terminal
kinase catalytic domain and a C-terminal "caspase-recruitment" domain that interacts with
caspase-like domains, indicating that RICK plays a role in the recruitment of caspase-8. This
interpretation is supported by the fact that the expression of RICK in human 293T cells promotes
activation of caspase-8 and potentiates the induction of apoptosis by various proteins involved in the
CD95 apoptosis pathway (Inohara et al., supra).

Mitochondrial Protein Kinases

A novel class of eukaryotic kinases, related by sequence to prokaryotic histidine protein
kinases, are the mitochondrial protein kinases (MPKs) which seem to have no sequence similarity
with other eukaryotic protein kinases. These protein kinases are located exclusively in the
mitochondrial matrix space and may have evolved from genes originally present in
respiration-dependent bacteria which were endocytosed by primitive eukaryotic cells. MPKs are responsible for
phosphorylation and inactivation of the branched-chain alpha-ketoacid dehydrogenase and pyruvate
dehydrogenase complexes (Harris, R.A. et al. (1995) Adv. Enzyme Regul. 34:147-162). Five MPKs
have been identified. Four members correspond to pyruvate dehydrogenase kinase isozymes,
regulating the activity of the pyruvate dehydrogenase complex, which is an important regulatory
enzyme at the interface between glycolysis and the citric acid cycle. The fifth member corresponds to
a branched-chain alpha-ketoacid dehydrogenase kinase, important in the regulation of the pathway for
the disposal of branched-chain amino acids (Harris, R.A. et al. (1997) Adv. Enzyme Regul.
37:271-293). Both starvation and the diabetic state are known to result in a great increase in the activity of
the pyruvate dehydrogenase kinase in the liver, heart and muscle of the rat. This increase contributes
in both disease states to the phosphorylation and inactivation of the pyruvate dehydrogenase complex
and conservation of pyruvate and lactate for gluconeogenesis (Harris (1995) supra).
KINASES WITH NON-PROTEIN SUBSTRATES 
Lipid and Inositol Kinases

Lipid kinases phosphorylate hydroxyl residues on lipid head groups. A family of kinases
involved in phosphorylation of phosphatidylinositol (PI) has been described, each member
phosphorylating a specific carbon on the inositol ring (Leevers, S.J. et al. (1999) Curr. Opin. Cell.
Biol. 11:219-225). The phosphorylation of phosphatidylinositol is involved in activation of the
protein kinase C signaling pathway. The inositol phospholipids (phosphoinositides) intracellular
signaling pathway begins with binding of a signaling molecule to a G-protein linked receptor in the
plasma membrane. This leads to the phosphorylation of phosphatidylinositol (PI) residues on the
inner side of the plasma membrane by inositol kinases, thus converting PI residues to the bisphosphate
state (PIP2). PIP2 is then cleaved into inositol triphosphate (IP3) and diacylglycerol. These two
products act as mediators for separate signaling pathways. Cellular responses that are mediated by
these pathways are glycogen breakdown in the liver in response to vasopressin, smooth muscle
contraction in response to acetylcholine, and thrombin-induced platelet aggregation.

PI 3-kinase (PI3K), which phosphorylates the D3 position of PI and its derivatives, has a
central role in growth factor signal cascades involved in cell growth, differentiation, and metabolism.
PI3K is a heterodimer consisting of an adapter subunit and a catalytic subunit. The adapter subunit
acts as a scaffolding protein, interacting with specific tyrosine-phosphorylated proteins, lipid
moieties, and other cytosolic factors. When the adapter subunit binds tyrosine phosphorylated
targets, such as the insulin responsive substrate (IRS)-1, the catalytic subunit is activated and converts
PI (4,5) bisphosphate (PIP2) to PI (3,4,5) P3 (PIP3). PIP3 then activates a number of other proteins,
including PKA, protein kinase B (PKB), protein kinase C (PKC), glycogen synthase kinase (GSK)-3,
and p70 ribosomal S6 kinase. PI3K also interacts directly with the cytoskeletal organizing proteins,
Rac, rho, and cdc42 (Shepherd, P.R. et al. (1998) Biochem. J. 333:471-490). Animal models for
diabetes, such as obese and fat mice, have altered PI3K adapter subunit levels. Specific mutations in
the adapter subunit have also been found in an insulin-resistant Danish population, suggesting a role
for PI3K in type-2 diabetes (Shepherd, supra).

An example of lipid kinase phosphorylation activity is the phosphorylation of
D-erythro-sphingosine to the sphingolipid metabolite, sphingosine-1-phosphate (SPP). SPP has
emerged as a novel lipid second-messenger with both extracellular and intracellular actions (Kohama,
T. et al. (1998) J. Biol. Chem. 273:23722-23728). Extracellularly, SPP is a ligand for the G-protein
coupled receptor EDG-1 (endothelial-derived, G-protein coupled receptor). Intracellularly, SPP
regulates cell growth, survival, motility, and cytoskeletal changes. SPP levels are regulated by
sphingosine kinases that specifically phosphorylate D-erythro-sphingosine to SPP. The importance of
sphingosine kinase in cell signaling is indicated by the fact that various stimuli, including
platelet-derived growth factor (PDGF), nerve growth factor, and activation of protein kinase C,
increase cellular levels of SPP by activation of sphingosine kinase, and the fact that competitive
inhibitors of the enzyme selectively inhibit cell proliferation induced by PDGF (Kohama et al.,
supra).

Purine Nucleotide Kinases

The purine nucleotide kinases, adenylate kinase (ATP:AMP phosphotransferase, or AdK) and
guanylate kinase (ATP:GMP phosphotransferase, or GuK) play a key role in nucleotide metabolism
and are crucial to the synthesis and regulation of cellular levels of ATP and GTP, respectively. These
two molecules are precursors in DNA and RNA synthesis in growing cells and provide the primary
source of biochemical energy in cells (ATP), and signal transduction pathways (GTP). Inhibition of
various steps in the synthesis of these two molecules has been the basis of many antiproliferative
drugs for cancer and antiviral therapy (Pillwein, K. et al. (1990) Cancer Res. 50:1576-1579).

AdK is found in almost all cell types and is especially abundant in cells having high rates of
ATP synthesis and utilization such as skeletal muscle. In these cells AdK is physically associated
with mitochondria and myofibrils, the subcellular structures that are involved in energy production
and utilization, respectively. Recent studies have demonstrated a major function for AdK in
transferring high energy phosphoryls from metabolic processes generating ATP to cellular
components consuming ATP (Zeleznikar, R.J. et al. (1995) J. Biol. Chem. 270:7311-7319). Thus
AdK may have a pivotal role in maintaining energy production in cells, particularly those having a
high rate of growth or metabolism such as cancer cells, and may provide a target for suppression of its
activity in order to treat certain cancers. Alternatively, reduced AdK activity may be a source of
various metabolic, muscle-energy disorders that can result in cardiac or respiratory failure and may be
treatable by increasing AdK activity.

GuK, in addition to providing a key step in the synthesis of GTP for RNA and DNA
synthesis, also fulfills an essential function in signal transduction pathways of cells through the
regulation of GDP and GTP. Specifically, GTP binding to membrane associated G proteins mediates
the activation of cell receptors, subsequent intracellular activation of adenyl cyclase, and production
of the second messenger, cyclic AMP. GDP binding to G proteins inhibits these processes. GDP and
GTP levels also control the activity of certain oncogenic proteins such as p21ras known to be involved
in control of cell proliferation and oncogenesis (Bos, J.L. (1989) Cancer Res. 49:4682-4689). High
ratios of GTP:GDP caused by suppression of GuK cause activation of p21ras and promote
oncogenesis. Increasing GuK activity to increase levels of GDP and reduce the GTP:GDP ratio may
provide a therapeutic strategy to reverse oncogenesis.

GuK is an important enzyme in the phosphorylation and activation of certain antiviral drugs
useful in the treatment of herpes virus infections. These drugs include the guanine homologs
acyclovir and buciclovir (Miller, W.H. and R.L. Miller (1980) J. Biol. Chem. 255:7204-7207;
Stenberg, K. et al. (1986) J. Biol. Chem. 261:2134-2139). Increasing GuK activity in infected cells
may provide a therapeutic strategy for augmenting the effectiveness of these drugs and possibly for
reducing the necessary dosages of the drugs.

Pyrimidine Kinases

The pyrimidine kinases are deoxycytidine kinase and thymidine kinase 1 and 2.
Deoxycytidine kinase is located in the nucleus, and thymidine kinase 1 and 2 are found in the cytosol
(Johansson, M. et al. (1997) Proc. Natl. Acad. Sci. USA 94:11941-11945). Phosphorylation of
deoxyribonucleosides by pyrimidine kinases provides an alternative pathway for de novo synthesis of
DNA precursors. The role of pyrimidine kinases, like purine kinases, in phosphorylation is critical to
the activation of several chemotherapeutically important nucleoside analogues (Arner, E.S. and S.
Eriksson (1995) Pharmacol. Ther. 67:155-186).
PHOSPHATASES 

Protein phosphatases are generally characterized as either serine/threonine- or tyrosine-specific
based on their preferred phospho-amino acid substrate. However, some phosphatases (DSPs,
for dual specificity phosphatases) can act on phosphorylated tyrosine, serine, or threonine residues.
The protein serine/threonine phosphatases (PSPs) are important regulators of many cAMP-mediated
hormone responses in cells. Protein tyrosine phosphatases (PTPs) play a significant role in cell cycle
and cell signaling processes. Another family of phosphatases is the acid phosphatase or histidine acid
phosphatase (HAP) family whose members hydrolyze phosphate esters at acidic pH conditions.

PSPs are found in the cytosol, nucleus, and mitochondria and in association with cytoskeletal
and membranous structures in most tissues, especially the brain. Some PSPs require divalent cations,
such as Ca2+ or Mn2+, for activity. PSPs play important roles in glycogen metabolism, muscle
contraction, protein synthesis, T cell function, neuronal activity, oocyte maturation, and hepatic
metabolism (reviewed in Cohen, P. (1989) Annu. Rev. Biochem. 58:453-508). PSPs can be separated
into two classes. The PPP class includes PP1, PP2A, PP2B/calcineurin, PP4, PP5, PP6, and PP7.
Members of this class are composed of a homologous catalytic subunit bearing a very highly
conserved signature sequence, coupled with one or more regulatory subunits (PROSITE
PDOC00115). Further interactions with scaffold and anchoring molecules determine the intracellular
localization of PSPs and substrate specificity. The PPM class consists of several closely related
isoforms of PP2C and is evolutionarily unrelated to the PPP class.

PP1 dephosphorylates many of the proteins phosphorylated by cyclic AMP-dependent protein
kinase (PKA) and is an important regulator of many cAMP-mediated hormone responses in cells. A
number of isoforms have been identified, with the alpha and beta forms being produced by alternative
splicing of the same gene. Both ubiquitous and tissue-specific targeting proteins for PP1 have been
identified. In the brain, inhibition of PP1 activity by the dopamine and adenosine 3',5'-
monophosphate-regulated phosphoprotein of 32 kDa (DARPP-32) is necessary for normal dopamine
response in neostriatal neurons (reviewed in Price, N.E. and M.C. Mumby (1999) Curr. Opin.
Neurobiol. 9:336-342). PP1, along with PP2A, has been shown to limit motility in microvascular
endothelial cells, suggesting a role for PSPs in the inhibition of angiogenesis (Gabel, S. et al. (1999)
Otolaryngol. Head Neck Surg. 121:463-468).

PP2A is the main serine/threonine phosphatase. The core PP2A enzyme consists of a single
36 kDa catalytic subunit (C) associated with a 65 kDa scaffold subunit (A), whose role is to recruit
additional regulatory subunits (B). Three gene families encoding B subunits are known (PR55, PR61,
and PR72), each of which contains multiple isoforms, and additional families may exist (Millward,
T.A. et al. (1999) Trends Biosci. 24:186-191). These "B-type" subunits are cell type- and tissue-
specific and determine the substrate specificity, enzymatic activity, and subcellular localization of the
holoenzyme. The PR55 family is highly conserved and bears a conserved motif (PROSITE
PDOC00785). PR55 increases PP2A activity toward mitogen-activated protein kinase (MAPK) and
MAPK kinase (MEK). PP2A dephosphorylates the MAPK active site, inhibiting the cell's entry into
mitosis. Several proteins can compete with PR55 for PP2A core enzyme binding, including the CKII
kinase catalytic subunit, polyomavirus middle and small T antigens, and SV40 small t antigen.
Viruses may use this mechanism to commandeer PP2A and stimulate progression of the cell through
the cell cycle (Pallas, D.C. et al. (1992) J. Virol. 66:886-893). Altered MAP kinase expression is also
implicated in a variety of disease conditions including cancer, inflammation, immune disorders, and
disorders affecting growth and development. PP2A, in fact, can dephosphorylate and modulate the
activities of more than 30 protein kinases in vitro, and other evidence suggests that the same is true in
vivo for such kinases as PKB, PKC, the calmodulin-dependent kinases, ERK family MAP kinases,
cyclin-dependent kinases, and the IkB kinases (reviewed in Millward et al., supra). PP2A is itself a
substrate for CKI and CKII kinases, and can be stimulated by polycationic macromolecules. A PP2A-
like phosphatase is necessary to maintain the G1 phase destruction of mammalian cyclins A and B
(Bastians, H. et al. (1999) Mol. Biol. Cell 10:3927-3941). PP2A is a major activity in the brain and is
implicated in regulating neurofilament stability and normal neural function, particularly the
phosphorylation of the microtubule-associated protein tau. Hyperphosphorylation of tau has been
proposed to lead to the neuronal degeneration seen in Alzheimer's disease (reviewed in Price and
Mumby, supra).

PP2B, or calcineurin, is a Ca2+-activated dimeric phosphatase and is particularly abundant in
the brain. It consists of catalytic and regulatory subunits, and is activated by the binding of the
calcium/calmodulin complex. Calcineurin is the target of the immunosuppressant drugs cyclosporine
and FK506. Along with other cellular factors, these drugs interact with calcineurin and inhibit
phosphatase activity. In T cells, this blocks the calcium dependent activation of the NF-AT family of
transcription factors, leading to immunosuppression. This family is widely distributed, and it is likely
that calcineurin regulates gene expression in other tissues as well. In neurons, calcineurin modulates
functions which range from the inhibition of neurotransmitter release to desensitization of
postsynaptic NMDA-receptor coupled calcium channels to long term memory (reviewed in Price and
Mumby, supra).

Other members of the PPP class have recently been identified (Cohen, P.T. (1997) Trends
Biochem. Sci. 22:245-251). One of them, PP5, contains regulatory domains with tetratricopeptide
repeats. It can be activated by polyunsaturated fatty acids and anionic phospholipids in vitro and
appears to be involved in a number of signaling pathways, including those controlled by atrial
natriuretic peptide or steroid hormones (reviewed in Andreeva, A.V. and M.A. Kutuzov (1999) Cell
Signal. 11:555-562).

PP2C is a ~42 kDa monomer with broad substrate specificity and is dependent on divalent
cations (mainly Mn2+ or Mg2+) for its activity. PP2C proteins share a conserved N-terminal region
with an invariant DGH motif, which contains an aspartate residue involved in cation binding
(PROSITE PDOC00792). Targeting proteins and mechanisms regulating PP2C activity have not
been identified. PP2C has been shown to inhibit the stress-responsive p38 and Jun kinase (JNK)
pathways (Takekawa, M. et al. (1998) EMBO J. 17:4744-4752).

In contrast to PSPs, tyrosine-specific phosphatases (PTPs) are generally monomeric proteins
of very diverse size (from 20 kDa to greater than 100 kDa) and structure that function primarily in the
transduction of signals across the plasma membrane. PTPs are categorized as either soluble
phosphatases or transmembrane receptor proteins that contain a phosphatase domain. All PTPs share
a conserved catalytic domain of about 300 amino acids which contains the active site. The active site
consensus sequence includes a cysteine residue which executes a nucleophilic attack on the phosphate
moiety during catalysis (Neel, B.G. and N.K. Tonks (1997) Curr. Opin. Cell Biol. 9:193-204).
Receptor PTPs are made up of an N-terminal extracellular domain of variable length, a
transmembrane region, and a cytoplasmic region that generally contains two copies of the catalytic
domain. Although only the first copy seems to have enzymatic activity, the second copy apparently
affects the substrate specificity of the first. The extracellular domains of some receptor PTPs contain
fibronectin-like repeats, immunoglobulin-like domains, MAM domains (an extracellular motif likely
to have an adhesive function), or carbonic anhydrase-like domains (PROSITE PDOC00323). This
wide variety of structural motifs accounts for the diversity in size and specificity of PTPs.

PTPs play important roles in biological processes such as cell adhesion, lymphocyte
activation, and cell proliferation. PTPs μ and κ are involved in cell-cell contacts, perhaps regulating
cadherin/catenin function. A number of PTPs affect cell spreading, focal adhesions, and cell motility,
most of them via the integrin/tyrosine kinase signaling pathway (reviewed in Neel and Tonks, supra).
CD45 phosphatases regulate signal transduction and lymphocyte activation (Ledbetter, J.A. et al.
(1988) Proc. Natl. Acad. Sci. USA 85:8628-8632). Soluble PTPs containing Src-homology-2
domains have been identified (SHPs), suggesting that these molecules might interact with receptor
tyrosine kinases. SHP-1 regulates cytokine receptor signaling by controlling the Janus family PTKs
in hematopoietic cells, as well as signaling by the T-cell receptor and c-Kit (reviewed in Neel and
Tonks, supra). M-phase inducer phosphatase plays a key role in the induction of mitosis by
dephosphorylating and activating the PTK CDC2, leading to cell division (Sadhu, K. et al. (1990)
Proc. Natl. Acad. Sci. USA 87:5139-5143). In addition, the genes encoding at least eight PTPs have
been mapped to chromosomal regions that are translocated or rearranged in various neoplastic
conditions, including lymphoma, small cell lung carcinoma, leukemia, adenocarcinoma, and
neuroblastoma (reviewed in Charbonneau, H. and N.K. Tonks (1992) Annu. Rev. Cell Biol. 8:463-
493). The PTP enzyme active site comprises the consensus sequence of the MTM1 gene family. The
MTM1 gene is responsible for X-linked recessive myotubular myopathy, a congenital muscle
disorder that has been linked to Xq28 (Kioschis, P. et al. (1998) Genomics 54:256-266). Many PTKs
are encoded by oncogenes, and it is well known that oncogenesis is often accompanied by increased
tyrosine phosphorylation activity. It is therefore possible that PTPs may serve to prevent or reverse
cell transformation and the growth of various cancers by controlling the levels of tyrosine
phosphorylation in cells. This is supported by studies showing that overexpression of PTP can
suppress transformation in cells and that specific inhibition of PTP can enhance cell transformation
(Charbonneau and Tonks, supra).

Apyrases are enzymes that efficiently hydrolyze ATP and ADP and may function either intra-
or extracellularly. One type of apyrase, ATP-diphosphohydrolase, catalyzes the hydrolysis of
phosphoanhydride bonds of nucleoside tri- and di-phosphates in the presence of divalent cations
(Nourizad, N. et al. (2003) Protein Purif. 27:229-237).

Dual specificity phosphatases (DSPs) are structurally more similar to the PTPs than the PSPs.
DSPs bear an extended PTP active site motif with an additional 7 amino acid residues. DSPs are
primarily associated with cell proliferation and include the cell cycle regulators cdc25A, B, and C.
The phosphatases DUSP1 and DUSP2 inactivate the MAPK family members ERK (extracellular
signal-regulated kinase), JNK (c-Jun N-terminal kinase), and p38 on both tyrosine and threonine
residues (PROSITE PDOC00323, supra). In the activated state, these kinases have been implicated
in neuronal differentiation, proliferation, oncogenic transformation, platelet aggregation, and
apoptosis. Thus, DSPs are necessary for proper regulation of these processes (Muda, M. et al. (1996)
J. Biol. Chem. 271:27205-27208). The tumor suppressor PTEN is a DSP that also shows lipid
phosphatase activity. It seems to negatively regulate interactions with the extracellular matrix and
maintains sensitivity to apoptosis. PTEN has been implicated in the prevention of angiogenesis (Giri,
D. and M. Ittmann (1999) Hum. Pathol. 30:419-424) and abnormalities in its expression are
associated with numerous cancers (reviewed in Tamura, M. et al. (1999) J. Natl. Cancer Inst.
91:1820-1828).

Histidine acid phosphatase (HAP; EXPASY EC 3.1.3.2), also known as acid phosphatase,
hydrolyzes a wide spectrum of substrates including alkyl, aryl, and acyl orthophosphate monoesters
and phosphorylated proteins at low pH. HAPs share two regions of conserved sequences, each
centered around a histidine residue which is involved in catalytic activity. Members of the HAP
family include lysosomal acid phosphatase (LAP) and prostatic acid phosphatase (PAP), both
sensitive to inhibition by L-tartrate (PROSITE PDOC00538).

Synaptojanin, a polyphosphoinositide phosphatase, dephosphorylates phosphoinositides at
positions 3, 4 and 5 of the inositol ring. Synaptojanin is a major presynaptic protein found at clathrin-
coated endocytic intermediates in nerve terminals, and binds the clathrin coat-associated protein
EPS15. This binding is mediated by the C-terminal region of synaptojanin-170, which has 3 Asp-Pro-
Phe amino acid repeats. Further, this 3-residue repeat has been found to be the binding site for the
EH domains of EPS15 (Haffner, C. et al. (1997) FEBS Lett. 419:175-180). Additionally,
synaptojanin may potentially regulate interactions of endocytic proteins with the plasma membrane,
and be involved in synaptic vesicle recycling (Brodin, L. et al. (2000) Curr. Opin. Neurobiol. 10:312-
320). Mice with a targeted disruption in the synaptojanin 1 gene (Synj1) were shown to
support coat formation of endocytic vesicles more effectively than was seen in wild-type mice,
suggesting that Synj1 can act as a negative regulator of membrane-coat protein interactions. These
findings provide genetic evidence for a crucial role of phosphoinositide metabolism in synaptic
vesicle recycling (Cremona, O. et al. (1999) Cell 99:179-188).
Expression profiling

Microarrays are analytical tools used in bioanalysis. A microarray has a plurality of
molecules spatially distributed over, and stably associated with, the surface of a solid support.
Microarrays of polypeptides, polynucleotides, and/or antibodies have been developed and find use in
a variety of applications, such as gene sequencing, monitoring gene expression, gene mapping,
bacterial identification, drug discovery, and combinatorial chemistry.

One area in particular in which microarrays find use is in gene expression analysis. Array
technology can provide a simple way to explore the expression of a single polymorphic gene or the
expression profile of a large number of related or unrelated genes. When the expression of a single
gene is examined, arrays are employed to detect the expression of a specific gene or its variants.
When an expression profile is examined, arrays provide a platform for identifying genes that are
tissue specific, are affected by a substance being tested in a toxicology assay, are part of a signaling
cascade, carry out housekeeping functions, or are specifically related to a particular genetic
predisposition, condition, disease, or disorder.
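As an illustration of how an expression profile might be screened for genes affected by a test substance, the following Python sketch compares two profiles by log2 fold change. It is not part of the disclosed methods: the gene names, the pseudocount, and the two-fold threshold are all hypothetical choices made for the example.

```python
import math


def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """Log2 ratio of treated to control signal; the pseudocount (an assumption) avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def differentially_expressed(treated_profile, control_profile, min_abs_log2fc=1.0):
    """Return {gene: log2 fold change} for genes whose absolute change meets the threshold.

    Only genes present in both profiles are compared.
    """
    hits = {}
    for gene in treated_profile.keys() & control_profile.keys():
        lfc = log2_fold_change(treated_profile[gene], control_profile[gene])
        if abs(lfc) >= min_abs_log2fc:
            hits[gene] = lfc
    return hits


# Hypothetical array signals for three placeholder genes.
treated = {"geneA": 7.0, "geneB": 3.0, "geneC": 0.0}
control = {"geneA": 1.0, "geneB": 3.0, "geneC": 3.0}
changed = differentially_expressed(treated, control)
```

Under these assumed inputs, geneA is reported as increased and geneC as decreased, while geneB is unchanged and therefore omitted.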
Lung Cancer

The potential application of gene expression profiling is particularly relevant to improving
diagnosis, prognosis, and treatment of cancer, such as lung cancer. Lung cancer is the leading cause
of cancer death in the United States, affecting more than 100,000 men and 50,000 women each year.
Nearly 90% of the patients diagnosed with lung cancer are cigarette smokers. Tobacco smoke
contains thousands of noxious substances that induce carcinogen metabolizing enzymes and covalent
DNA adduct formation in the exposed bronchial epithelium. In nearly 80% of patients diagnosed
with lung cancer, metastasis has already occurred. Most commonly lung cancers metastasize to
pleura, brain, bone, pericardium, and liver. The decision to treat with surgery, radiation therapy, or
chemotherapy is made on the basis of tumor histology, response to growth factors or hormones, and
sensitivity to inhibitors or drugs. With current treatments, most patients die within one year of
diagnosis. Earlier diagnosis and a systematic approach to identification, staging, and treatment of
lung cancer could positively affect patient outcome.

Lung cancers progress through a series of morphologically distinct stages from hyperplasia
to invasive carcinoma. Malignant lung cancers are divided into two groups comprising four
histopathological classes. The Non Small Cell Lung Carcinoma (NSCLC) group includes squamous
cell carcinomas, adenocarcinomas, and large cell carcinomas and accounts for about 70% of all lung
cancer cases. Adenocarcinomas typically arise in the peripheral airways and often form mucin
secreting glands. Squamous cell carcinomas typically arise in proximal airways. The histogenesis of
squamous cell carcinomas may be related to chronic inflammation and injury to the bronchial
epithelium, leading to squamous metaplasia. The Small Cell Lung Carcinoma (SCLC) group
accounts for about 20% of lung cancer cases. SCLCs typically arise in proximal airways and exhibit
a number of paraneoplastic syndromes including inappropriate production of adrenocorticotropin and
anti-diuretic hormone.

Lung cancer cells accumulate numerous genetic lesions, many of which are associated with
cytologically visible chromosomal aberrations. The high frequency of chromosomal deletions
associated with lung cancer may reflect the role of multiple tumor suppressor loci in the etiology of
this disease. Deletion of the short arm of chromosome 3 is found in over 90% of cases and represents
one of the earliest genetic lesions leading to lung cancer. Deletions at chromosome arms 9p and 17p
are also common. Other frequently observed genetic lesions include overexpression of telomerase,
activation of oncogenes such as K-ras and c-myc, and inactivation of tumor suppressor genes such as
RB, p53, and CDKN2.

Genes differentially regulated in lung cancer have been identified by a variety of methods.
Using mRNA differential display technology, Manda et al. (1999; Genomics 51:5-14) identified five
genes differentially expressed in lung cancer cell lines compared to normal bronchial epithelial cells.
Among the known genes, pulmonary surfactant apoprotein A and alpha 2 macroglobulin were down
regulated whereas nm23H1 was upregulated. Petersen et al. (2000; Int. J. Cancer 86:512-517) used
suppression subtractive hybridization to identify 552 clones differentially expressed in lung tumor
derived cell lines, 205 of which represented known genes. Among the known genes,
thrombospondin-1, fibronectin, intercellular adhesion molecule 1, and cytokeratins 6 and 18 were
previously observed to be differentially expressed in lung cancers. Wang et al. (2000; Oncogene
19:1519-1528) used a combination of microarray analysis and subtractive hybridization to identify 17
genes differentially overexpressed in squamous cell carcinoma compared with normal lung
epithelium. Among the known genes they identified were keratin isoform 6, KOC, SPRC, IGFb2,
connexin 26, plakofillin 1 and cytokeratin 13.

There is a need in the art for new compositions, including nucleic acids and proteins, for the
diagnosis, prevention, and treatment of cardiovascular diseases, immune system disorders,
neurological disorders, disorders affecting growth and development, lipid disorders, cell proliferative
disorders, and cancers.

SUMMARY OF THE INVENTION 

Various embodiments of the invention provide purified polypeptides, kinases and
phosphatases, referred to collectively as 'KPP' and individually as 'KPP-1,' 'KPP-2,' 'KPP-3,' 'KPP-
4,' and 'KPP-5,' and methods for using these proteins and their encoding polynucleotides for the
detection, diagnosis, and treatment of diseases and medical conditions. Embodiments also provide
methods for utilizing the purified kinases and phosphatases and/or their encoding polynucleotides for
facilitating the drug discovery process, including determination of efficacy, dosage, toxicity, and
pharmacology. Related embodiments provide methods for utilizing the purified kinases and
phosphatases and/or their encoding polynucleotides for investigating the pathogenesis of diseases and
medical conditions.

An embodiment provides an isolated polypeptide selected from the group consisting of a) a
polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-
5, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at
least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-5, c) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-5, and d) an immunogenic fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-5. Another
embodiment provides an isolated polypeptide comprising an amino acid sequence of SEQ ID NO:1-5.
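The "at least 90% identical" criterion can be illustrated with a short Python sketch. This is only a minimal example under stated assumptions: the two sequences are assumed to be already aligned to equal length with '-' marking gaps, identity is counted over all aligned columns that contain at least one residue, and the example peptides are invented. The alignment algorithm and the exact definition of percent identity are not specified in this passage and are not assumed here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical residues.

    Assumes both inputs come from the same pairwise alignment ('-' = gap).
    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no residues
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no residues to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Invented 10-residue peptides differing at one position.
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQK"
identity = percent_identity(reference, variant)
```

With one mismatch in ten aligned positions, the invented variant sits exactly at the 90% boundary and therefore satisfies the threshold as written here.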

Still another embodiment provides an isolated polynucleotide encoding a polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected
from the group consisting of SEQ ID NO:1-5, b) a polypeptide comprising a naturally occurring
amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence
selected from the group consisting of SEQ ID NO:1-5, c) a biologically active fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-5,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-5. In another embodiment, the polynucleotide encodes a
polypeptide selected from the group consisting of SEQ ID NO:1-5. In an alternative embodiment, the
polynucleotide is selected from the group consisting of SEQ ID NO:6-10.

Still another embodiment provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-5, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-5, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:1-5, and d) an immunogenic fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-5.
Another embodiment provides a cell transformed with the recombinant polynucleotide. Yet another
embodiment provides a transgenic organism comprising the recombinant polynucleotide.

Another embodiment provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-5, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-5, c) a biologically active fragment of a polypeptide having an amino acid
sequence selected from the group consisting of SEQ ID NO:1-5, and d) an immunogenic fragment of
a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-5.
The method comprises a) culturing a cell under conditions suitable for expression of the polypeptide,
wherein said cell is transformed with a recombinant polynucleotide comprising a promoter sequence
operably linked to a polynucleotide encoding the polypeptide, and b) recovering the polypeptide so
expressed.

Yet another embodiment provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-5, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-5, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-5, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-5.

Still yet another embodiment provides an isolated polynucleotide selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:6-10, b) a polynucleotide comprising a naturally occurring polynucleotide
sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected
from the group consisting of SEQ ID NO:6-10, c) a polynucleotide complementary to the
polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA
equivalent of a)-d). In other embodiments, the polynucleotide can comprise at least about 20, 30, 40,
60, 80, or 100 contiguous nucleotides.

Yet another embodiment provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide being selected from the group consisting of a) a polynucleotide comprising
a polynucleotide sequence selected from the group consisting of SEQ ID NO:6-10, b) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:6-10, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and which probe
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization
complex is formed between said probe and said target polynucleotide or fragments thereof, and b)
detecting the presence or absence of said hybridization complex. In a related embodiment, the
method can include detecting the amount of the hybridization complex. In still other embodiments,
the probe can comprise at least about 20, 30, 40, 60, 80, or 100 contiguous nucleotides.
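The probe requirement described above (at least 20 contiguous nucleotides complementary to the target) can be sketched in Python as follows. This is an in silico illustration only, under simplifying assumptions: complementarity is treated as perfect Watson-Crick pairing with no mismatches, hybridization stringency is not modeled, and the target sequence is an invented placeholder rather than any of SEQ ID NO:6-10.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA or RNA sequence, returned as DNA."""
    return seq.upper().replace("U", "T").translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """RNA equivalent of a DNA sequence (T replaced by U)."""
    return dna.upper().replace("T", "U")


def probe_matches_target(probe: str, target: str, min_contiguous: int = 20) -> bool:
    """True if some stretch of >= min_contiguous probe nucleotides is perfectly
    complementary to the target.

    A probe window is complementary to the target exactly when the reverse
    complement of that window occurs as a substring of the target.
    """
    probe_rc = reverse_complement(probe)
    target_dna = target.upper().replace("U", "T")
    for start in range(len(probe_rc) - min_contiguous + 1):
        if probe_rc[start:start + min_contiguous] in target_dna:
            return True
    return False


# Invented 36-nucleotide target and a 25-nucleotide probe designed against it.
target = "ATGGCCAAGCTTGGATCCGAATTCGGTACCTTAAGG"
probe = reverse_complement(target[5:30])
detected = probe_matches_target(probe, target)
```

Because an RNA target is normalized to DNA letters before comparison, the same check applies to the RNA equivalent of the target.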

Still yet another embodiment provides a method for detecting a target polynucleotide in a
sample, said target polynucleotide being selected from the group consisting of a) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:6-10, b) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:6-10, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and b) detecting the presence or absence of said amplified target
polynucleotide or fragment thereof. In a related embodiment, the method can include detecting the
amount of the amplified target polynucleotide or fragment thereof.

Another embodiment provides a composition comprising an effective amount of a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-5, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-5, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-5, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-5, and a pharmaceutically acceptable excipient.
In one embodiment, the composition can comprise an amino acid sequence selected from the group
consisting of SEQ ID NO:1-5. Other embodiments provide a method of treating a disease or
condition associated with decreased or abnormal expression of functional KPP, comprising
administering to a patient in need of such treatment the composition.

Yet another embodiment provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-5, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-5, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-5, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-5. The method comprises a) contacting a sample
comprising the polypeptide with a compound, and b) detecting agonist activity in the sample.
Another embodiment provides a composition comprising an agonist compound identified by the
method and a pharmaceutically acceptable excipient. Yet another embodiment provides a method of
treating a disease or condition associated with decreased expression of functional KPP, comprising
administering to a patient in need of such treatment the composition.

Still yet another embodiment provides a method for screening a compound for effectiveness 
as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an 
amino acid sequence selected from the group consisting of SEQ ID NO:1-5, b) a polypeptide 
comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% 
identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-5, c) a 
biologically active fragment of a polypeptide having an amino acid sequence selected from the group 
consisting of SEQ ID NO:1-5, and d) an immunogenic fragment of a polypeptide having an amino 
acid sequence selected from the group consisting of SEQ ID NO:1-5. The method comprises a) 
contacting a sample comprising the polypeptide with a compound, and b) detecting antagonist activity 
in the sample. Another embodiment provides a composition comprising an antagonist compound 
identified by the method and a pharmaceutically acceptable excipient. Yet another embodiment 
provides a method of treating a disease or condition associated with overexpression of functional 
KPP, comprising administering to a patient in need of such treatment the composition. 

Another embodiment provides a method of screening for a compound that specifically binds 
to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid 
sequence selected from the group consisting of SEQ ID NO:1-5, b) a polypeptide comprising a 
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an 
amino acid sequence selected from the group consisting of SEQ ID NO:1-5, c) a biologically active 
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ 
ID NO:1-5, and d) an immunogenic fragment of a polypeptide having an amino acid sequence 
selected from the group consisting of SEQ ID NO:1-5. The method comprises a) combining the 
polypeptide with at least one test compound under suitable conditions, and b) detecting binding of the 
polypeptide to the test compound, thereby identifying a compound that specifically binds to the 
polypeptide. 

Yet another embodiment provides a method of screening for a compound that modulates the 
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino 
acid sequence selected from the group consisting of SEQ ID NO:1-5, b) a polypeptide comprising a 
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an 
amino acid sequence selected from the group consisting of SEQ ID NO:1-5, c) a biologically active 
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ 
ID NO:1-5, and d) an immunogenic fragment of a polypeptide having an amino acid sequence 
selected from the group consisting of SEQ ID NO:1-5. The method comprises a) combining the 
polypeptide with at least one test compound under conditions permissive for the activity of the 
polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c) 
comparing the activity of the polypeptide in the presence of the test compound with the activity of the 
polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide 
in the presence of the test compound is indicative of a compound that modulates the activity of the 
polypeptide. 

Still yet another embodiment provides a method for screening a compound for effectiveness 
in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a 
polynucleotide sequence selected from the group consisting of SEQ ID NO:6-10, the method 
comprising a) contacting a sample comprising the target polynucleotide with a compound, b) 
detecting altered expression of the target polynucleotide, and c) comparing the expression of the 
target polynucleotide in the presence of varying amounts of the compound and in the absence of the 
compound. 

Another embodiment provides a method for assessing toxicity of a test compound, said 
method comprising a) treating a biological sample containing nucleic acids with the test compound; 
b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide 
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:6-10, ii) a 
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at 
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID 
NO:6-10, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide 
complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs 
under conditions whereby a specific hybridization complex is formed between said probe and a target 
polynucleotide in the biological sample, said target polynucleotide selected from the group consisting 
of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of 
SEQ ID NO:6-10, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at 
least 90% identical or at least about 90% identical to a polynucleotide sequence selected from the 
group consisting of SEQ ID NO:6-10, iii) a polynucleotide complementary to the polynucleotide of 
i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of 
i)-iv). Alternatively, the target polynucleotide can comprise a fragment of a polynucleotide selected 
from the group consisting of i)-v) above; c) quantifying the amount of hybridization complex; and d) 
comparing the amount of hybridization complex in the treated biological sample with the amount of 
hybridization complex in an untreated biological sample, wherein a difference in the amount of 
hybridization complex in the treated biological sample is indicative of toxicity of the test compound. 

BRIEF DESCRIPTION OF THE TABLES 
Table 1 summarizes the nomenclature for full length polynucleotide and polypeptide 
embodiments of the invention. 

Table 2 shows the GenBank identification number and annotation of the nearest GenBank 
homolog, and the PROTEOME database identification numbers and annotations of PROTEOME 
database homologs, for polypeptide embodiments of the invention. The probability scores for the 
matches between each polypeptide and its homolog(s) are also shown. 

Table 3 shows structural features of polypeptide embodiments, including predicted motifs 
and domains, along with the methods, algorithms, and searchable databases used for analysis of the 
polypeptides. 

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble 
polynucleotide embodiments, along with selected fragments of the polynucleotides. 

Table 5 shows representative cDNA libraries for polynucleotide embodiments. 

Table 6 provides an appendix which describes the tissues and vectors used for construction of 
the cDNA libraries shown in Table 5. 

Table 7 shows the tools, programs, and algorithms used to analyze polynucleotides and 
polypeptides, along with applicable descriptions, references, and threshold parameters. 

Table 8 shows single nucleotide polymorphisms found in polynucleotide sequences of the 
invention, along with allele frequencies in different human populations. 

DESCRIPTION OF THE INVENTION 
Before the present proteins, nucleic acids, and methods are described, it is understood that 
embodiments of the invention are not limited to the particular machines, instruments, materials, and 
methods described, as these may vary. It is also to be understood that the terminology used herein is 
for the purpose of describing particular embodiments only, and is not intended to limit the scope of 
the invention. 

As used herein and in the appended claims, the singular forms "a," "an," and "the" include 
plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "a 
host cell" includes a plurality of such host cells, and a reference to "an antibody" is a reference to one 
or more antibodies and equivalents thereof known to those skilled in the art, and so forth. 

Unless defined otherwise, all technical and scientific terms used herein have the same 
meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. 
Although any machines, materials, and methods similar or equivalent to those described herein can be 
used to practice or test the present invention, the preferred machines, materials and methods are now 
described. All publications mentioned herein are cited for the purpose of describing and disclosing 
the cell lines, protocols, reagents and vectors which are reported in the publications and which might 
be used in connection with various embodiments of the invention. Nothing herein is to be construed 
as an admission that the invention is not entitled to antedate such disclosure by virtue of prior 
invention. 

DEFINITIONS 

"KPP" refers to the amino acid sequences of substantially purified KPP obtained from any 
species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and 
human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant. 

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of 
KPP. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other 
compound or composition which modulates the activity of KPP either by directly interacting with 
KPP or by acting on components of the biological pathway in which KPP participates. 

An "allelic variant" is an alternative form of the gene encoding KPP. Allelic variants may 
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in 
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or 
many allelic variants of its naturally occurring form. Common mutational changes which give rise to 
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. 
Each of these types of changes may occur alone, or in combination with the others, one or more times 
in a given sequence. 

"Altered" nucleic acid sequences encoding KPP include those sequences with deletions, 
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as KPP or a 
polypeptide with at least one functional characteristic of KPP. Included within this definition are 
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe 
of the polynucleotide encoding KPP, and improper or unexpected hybridization to allelic variants, 
with a locus other than the normal chromosomal locus for the polynucleotide encoding KPP. The 
encoded protein may also be "altered," and may contain deletions, insertions, or substitutions of 
amino acid residues which produce a silent change and result in a functionally equivalent KPP. 
Deliberate amino acid substitutions may be made on the basis of one or more similarities in polarity, 
charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as 
long as the biological or immunological activity of KPP is retained. For example, negatively charged 
amino acids may include aspartic acid and glutamic acid, and positively charged amino acids may 
include lysine and arginine. Amino acids with uncharged polar side chains having similar 
hydrophilicity values may include: asparagine and glutamine; and serine and threonine. Amino acids 
with uncharged side chains having similar hydrophilicity values may include: leucine, isoleucine, and 
valine; glycine and alanine; and phenylalanine and tyrosine. 

The terms "amino acid" and "amino acid sequence" can refer to an oligopeptide, a peptide, a 
polypeptide, or a protein sequence, or a fragment of any of these, and to naturally occurring or 
synthetic molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally 
occurring protein molecule, "amino acid sequence" and like terms are not meant to limit the amino 
acid sequence to the complete native amino acid sequence associated with the recited protein 
molecule. 

"Amplification" relates to the production of additional copies of a nucleic acid. 
Amplification may be carried out using polymerase chain reaction (PCR) technologies or other 
nucleic acid amplification technologies well known in the art. 

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity 
of KPP. Antagonists may include proteins such as antibodies, anticalins, nucleic acids, 
carbohydrates, small molecules, or any other compound or composition which modulates the activity 
of KPP either by directly interacting with KPP or by acting on components of the biological pathway 
in which KPP participates. 

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments 
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant. 
Antibodies that bind KPP polypeptides can be prepared using intact polypeptides or using fragments 
containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide 
used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of 
RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly 
used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, 
and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal. 

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that 
makes contact with a particular antibody. When a protein or a fragment of a protein is used to 
immunize a host animal, numerous regions of the protein may induce the production of antibodies 
which bind specifically to antigenic determinants (particular regions or three-dimensional structures 
on the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen 
used to elicit the immune response) for binding to an antibody. 

The term "aptamer" refers to a nucleic acid or oligonucleotide molecule that binds to a 
specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX 
(Systematic Evolution of Ligands by EXponential Enrichment), described in U.S. Patent No. 
5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries. 
Aptamer compositions may be double-stranded or single-stranded, and may include 
deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules. 
The nucleotide components of an aptamer may have modified sugar groups (e.g., the 2'-OH group of a 
ribonucleotide may be replaced by 2'-F or 2'-NH2), which may improve a desired property, e.g., 
resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules, 
e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system. 
Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a 
cross-linker (Brody, E.N. and L. Gold (2000) J. Biotechnol. 74:5-13). 

The term "intramer" refers to an aptamer which is expressed in vivo. For example, a vaccinia 
virus-based RNA expression system has been used to express specific RNA aptamers at high levels in 
the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA 96:3606-3610). 

The term "spiegelmer" refers to an aptamer which includes L-DNA, L-RNA, or other left- 
handed nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed 
nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on 
substrates containing right-handed nucleotides. 

The term "antisense" refers to any composition capable of base-pairing with the "sense" 
(coding) strand of a polynucleotide having a specific nucleic acid sequence. Antisense compositions 
may include DNA; RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone 
linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides 
having modified sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or 
oligonucleotides having modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 
7-deaza-2'-deoxyguanosine. Antisense molecules may be produced by any method including chemical synthesis 
or transcription. Once introduced into a cell, the complementary antisense molecule base-pairs with a 
naturally occurring nucleic acid sequence produced by the cell to form duplexes which block either 
transcription or translation. The designation "negative" or "minus" can refer to the antisense strand, 
and the designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule. 

The term "biologically active" refers to a protein having structural, regulatory, or biochemical 
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic" 
refers to the capability of the natural, recombinant, or synthetic KPP, or of any oligopeptide thereof, 
to induce a specific immune response in appropriate animals or cells and to bind with specific 
antibodies. 

"Complementary" describes the relationship between two single-stranded nucleic acid 
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement, 
3'-TCA-5'. 
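
The base-pairing relationship in this definition can be illustrated with a short Python sketch. The helper names `complement` and `reverse_complement` are illustrative assumptions and do not appear in the specification; the sketch assumes DNA sequences that contain only A, C, G, and T.

```python
# Watson-Crick pairing for DNA bases (illustrative helper, not from the specification).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(seq: str) -> str:
    """Return the base-by-base complement, read in the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())

def reverse_complement(seq: str) -> str:
    """Return the complementary strand written in the 5' to 3' direction."""
    return complement(seq)[::-1]

# 5'-AGT-3' pairs with 3'-TCA-5', which reads 5'-ACT-3' when written 5' to 3'.
print(complement("AGT"))          # TCA
print(reverse_complement("AGT"))  # ACT
```

Reading the complement in the same left-to-right order gives the 3' to 5' strand, which is why the example in the definition is written 3'-TCA-5'.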

A "composition comprising a given polynucleotide" and a "composition comprising a given 
polypeptide" can refer to any composition containing the given polynucleotide or polypeptide. The 
composition may comprise a dry formulation or an aqueous solution. Compositions comprising 
polynucleotides encoding KPP or fragments of KPP may be employed as hybridization probes. The 
probes may be stored in freeze-dried form and may be associated with a stabilizing agent such as a 
carbohydrate. In hybridizations, the probe may be deployed in an aqueous solution containing salts 
(e.g., NaCl), detergents (e.g., sodium dodecyl sulfate; SDS), and other components (e.g., Denhardt's 
solution, dry milk, salmon sperm DNA, etc.). 

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to repeated 
DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied 
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been 
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer 
program for fragment assembly, such as the GELVIEW fragment assembly system (Accelrys, 
Burlington MA) or Phrap (University of Washington, Seattle WA). Some sequences have been both 
extended and assembled to produce the consensus sequence. 

"Conservative amino acid substitutions" are those substitutions that are predicted to least 
interfere with the properties of the original protein, i.e., the structure and especially the function of 
the protein is conserved and not significantly changed by such substitutions. The table below shows 
amino acids which may be substituted for an original amino acid in a protein and which are regarded 
as conservative amino acid substitutions. 



Original Residue     Conservative Substitution 
Ala                  Gly, Ser 
Arg                  His, Lys 
Asn                  Asp, Gln, His 
Asp                  Asn, Glu 
Cys                  Ala, Ser 
Gln                  Asn, Glu, His 
Glu                  Asp, Gln, His 
Gly                  Ala 
His                  Asn, Arg, Gln, Glu 
Ile                  Leu, Val 
Leu                  Ile, Val 
Lys                  Arg, Gln, Glu 
Met                  Leu, Ile 
Phe                  His, Met, Leu, Trp, Tyr 
Ser                  Cys, Thr 
Thr                  Ser, Val 
Trp                  Phe, Tyr 
Tyr                  His, Phe, Trp 
Val                  Ile, Leu, Thr 



Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide 
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, 
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of 
the side chain. 
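
As an illustration only, the substitution table above can be encoded as a lookup structure. The following Python sketch transcribes the table into a dictionary; the names `CONSERVATIVE` and `is_conservative` are assumptions made for this example. The lookup is directional, because the table lists permitted replacements for each original residue and is not fully symmetric.

```python
# Transcription of the conservative substitution table (three-letter codes).
# Keys are original residues; values are the listed conservative replacements.
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}

def is_conservative(original: str, replacement: str) -> bool:
    """True if the table lists `replacement` as conservative for `original`."""
    return replacement in CONSERVATIVE.get(original, set())

print(is_conservative("Ala", "Ser"))  # True: Ser is listed for Ala
print(is_conservative("Ser", "Ala"))  # False: Ala is not listed for Ser
```

Residues without a row in the table (for example, Pro) have no listed conservative substitutions under this sketch.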

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the 
absence of one or more amino acid residues or nucleotides. 

The term "derivative" refers to a chemically modified polynucleotide or polypeptide. 
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an 
alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which 
retains at least one biological or immunological function of the natural molecule. A derivative 
polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least 
one biological or immunological function of the polypeptide from which it was derived. 

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a 
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide. 

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or 
absent gene or protein expression, determined by comparing at least two different samples. Such 
comparisons may be carried out between, for example, a treated and an untreated sample, or a 
diseased and a normal sample. 

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an 
exon may represent a structural or functional domain of the encoded protein, new proteins may be 
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the 
evolution of new protein functions. 

A "fragment" is a unique portion of KPP or a polynucleotide encoding KPP which can be 
identical in sequence to, but shorter in length than, the parent sequence. A fragment may comprise up 
to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a 
fragment may comprise from about 5 to about 1000 contiguous nucleotides or amino acid residues. A 
fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 
5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino 
acid residues in length. Fragments may be preferentially selected from certain regions of a molecule. 
For example, a polypeptide fragment may comprise a certain length of contiguous amino acids 
selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a 
certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by 
the specification, including the Sequence Listing, tables, and figures, may be encompassed by the 
present embodiments. 

A fragment of SEQ ID NO:6-10 can comprise a region of unique polynucleotide sequence 
that specifically identifies SEQ ID NO:6-10, for example, as distinct from any other sequence in the 
genome from which the fragment was obtained. A fragment of SEQ ID NO:6-10 can be employed in 
one or more embodiments of methods of the invention, for example, in hybridization and 
amplification technologies and in analogous methods that distinguish SEQ ID NO:6-10 from related 
polynucleotides. The precise length of a fragment of SEQ ID NO:6-10 and the region of SEQ ID 
NO:6-10 to which the fragment corresponds are routinely determinable by one of ordinary skill in the 
art based on the intended purpose for the fragment. 

A fragment of SEQ ID NO:1-5 is encoded by a fragment of SEQ ID NO:6-10. A fragment of 
SEQ ID NO:1-5 can comprise a region of unique amino acid sequence that specifically identifies SEQ 
ID NO:1-5. For example, a fragment of SEQ ID NO:1-5 can be used as an immunogenic peptide for 
the development of antibodies that specifically recognize SEQ ID NO:1-5. The precise length of a 
fragment of SEQ ID NO:1-5 and the region of SEQ ID NO:1-5 to which the fragment corresponds can 
be determined based on the intended purpose for the fragment using one or more analytical methods 
described herein or otherwise known in the art. 

A "full length" polynucleotide is one containing at least a translation initiation codon (e.g., 
methionine) followed by an open reading frame and a translation termination codon. A "full length" 
polynucleotide sequence encodes a "full length" polypeptide sequence. 
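
The structural requirement in this definition (an initiation codon, an open reading frame, and a termination codon) can be sketched as a simple scan. The Python sketch below is a simplified illustration under stated assumptions: it uses the standard DNA stop codons, treats ATG as the only initiation codon, examines the forward strand only, and reports the first ATG that is followed in frame by a stop codon. The function names `find_orf` and `is_full_length` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard DNA termination codons

def find_orf(seq: str):
    """Return (start, end) of the first ATG-initiated reading frame that
    ends in an in-frame stop codon, or None if no such frame exists.
    `end` is the index just past the stop codon."""
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon from the codon after ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        start = seq.find("ATG", start + 1)
    return None

def is_full_length(seq: str) -> bool:
    """True if the sequence contains an initiation codon followed by an
    open reading frame and an in-frame termination codon."""
    return find_orf(seq) is not None

print(find_orf("CCATGGCTTAAGG"))    # (2, 11): ATG GCT TAA
print(is_full_length("ATGGCTGCT"))  # False: no in-frame stop codon
```

A practical analysis would also consider the reverse strand, alternative initiation codons, and a minimum frame length; those choices are outside this sketch.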

"Homology" refers to sequence similarity or, alternatively, sequence identity, between two or 

more polynucleotide sequences or two or more polypeptide sequences. 

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer 
to the percentage of identical nucleotide matches between at least two polynucleotide sequences 
aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and 
reproducible way, gaps in the sequences being compared in order to optimize alignment between two 
sequences, and therefore achieve a more meaningful comparison of the two sequences. 
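
As a rough illustration of this definition, the following Python sketch computes percent identity from two sequences that have already been aligned, with gaps written as "-". It does not perform the alignment itself. The choice of denominator (every alignment column that is not a gap in both sequences) is an assumption made for this example; alignment programs differ on this convention, and the sketch is not a description of any particular program's calculation.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding identical residues.

    Both inputs must be the same length and use '-' for gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Four of six columns match: A, C, T, and the final A.
print(round(percent_identity("ACGT-A", "ACCTGA"), 2))  # 66.67
```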

Percent identity between polynucleotide sequences may be determined using one or more 
computer algorithms or programs known in the art or described herein. For example, percent identity 
can be determined using the default parameters of the CLUSTAL V algorithm as incorporated into 
the MEGALIGN version 3.12e sequence alignment program. This program is part of the 
LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR, 
Madison WI). CLUSTAL V is described in Higgins, D.G. and P.M. Sharp (1989; CABIOS 5:151- 
153) and in Higgins, D.G. et al. (1992; CABIOS 8:189-191). For pairwise alignments of 
polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5, 
window=4, and "diagonals saved"=4. The "weighted" residue weight table is selected as the default. 

Alternatively, a suite of commonly used and freely available sequence comparison algorithms 
which can be used is provided by the National Center for Biotechnology Information (NCBI) Basic 
Local Alignment Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), 
which is available from several sources, including the NCBI, Bethesda, MD, and on the Internet at 
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence 
analysis programs including "blastn," that is used to align a known polynucleotide sequence with 
other polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2 
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2 
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. 
The "BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST 
programs are commonly used with gap and other parameters set to default settings. For example, to 
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version 
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example: 

Matrix: BLOSUM62 
Reward for match: 1 
Penalty for mismatch: -2 
Open Gap: 5 and Extension Gap: 2 penalties 
Gap X drop-off: 50 
Expect: 10 
Word Size: 11 
Filter: on 

Percent identity may be measured over the length of an entire defined sequence, for example, 
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, 
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at 
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous 
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length 
supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to 
describe a length over which percentage identity may be measured. 

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode 
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes 
in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid 
sequences that all encode substantially the same protein. 
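
The effect of codon degeneracy can be shown with a small worked example. The Python sketch below builds the standard genetic code table and translates two short, made-up DNA sequences; the sequences and the names `CODON_TABLE` and `translate` are illustrative assumptions and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence into one-letter amino acid codes."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

seq_1 = "ATGCTTTCTCGT"  # hypothetical sequence: Met-Leu-Ser-Arg
seq_2 = "ATGTTGAGCAGA"  # different codons chosen for the same residues

identical = sum(a == b for a, b in zip(seq_1, seq_2))
print(translate(seq_1), translate(seq_2))  # MLSR MLSR
print(identical, "of", len(seq_1), "nucleotides identical")  # 5 of 12
```

The two sequences share fewer than half of their nucleotides yet encode the same four-residue peptide, which is the situation the paragraph above describes.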

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to 
the percentage of identical residue matches between at least two polypeptide sequences aligned using 
a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some 
alignment methods take into account conservative amino acid substitutions. Such conservative 
substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at the 
site of substitution, thus preserving the structure (and therefore function) of the polypeptide. The 
phrases "percent similarity" and "% similarity," as applied to polypeptide sequences, refer to the 
percentage of residue matches, including identical residue matches and conservative substitutions, 
between at least two polypeptide sequences aligned using a standardized algorithm. In contrast, 
conservative substitutions are not included in the calculation of percent identity between polypeptide 
sequences. 
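
The distinction between percent identity and percent similarity can be sketched as follows. This Python example works on two pre-aligned polypeptide sequences in one-letter code and takes the set of conservative substitutions as an argument. The small mapping used in the demonstration is an illustrative subset chosen for this example, the function name `identity_and_similarity` is hypothetical, and using the full alignment length as the denominator is an assumption.

```python
def identity_and_similarity(aligned_a: str, aligned_b: str, conservative: dict):
    """Return (percent identity, percent similarity) for two aligned sequences.

    `conservative` maps an original residue to the residues counted as
    conservative replacements for it. Gap columns ('-') count toward the
    length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    length = len(aligned_a)
    if length == 0:
        return 0.0, 0.0
    identical = 0
    similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        if a == b:
            identical += 1
        elif b in conservative.get(a, ()):
            similar += 1
    return 100.0 * identical / length, 100.0 * (identical + similar) / length

# Illustrative subset of conservative substitutions (one-letter codes).
SUBSET = {"L": {"I", "V"}, "D": {"N", "E"}, "S": {"C", "T"}, "G": {"A"}}

# L->I and S->T are conservative, D=D is identical, G->W is neither.
print(identity_and_similarity("LDSG", "IDTW", SUBSET))  # (25.0, 75.0)
```

Percent similarity is therefore never lower than percent identity for the same alignment, because it adds conservative substitutions to the identical matches.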

Percent identity between polypeptide sequences may be determined using the default 
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e 
sequence alignment program (described and referenced above). For pairwise alignments of 
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap 
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default 
residue weight table. 

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise 
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for 
example: 

Matrix: BLOSUM62 
Open Gap: 11 and Extension Gap: 1 penalties 
Gap X drop-off: 50 
Expect: 10 
Word Size: 3 
Filter: on 

Percent identity may be measured over the length of an entire defined polypeptide sequence, 
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for 
example, over the length of a fragment taken from a larger, defined polypeptide sequence, for 
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment 
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be 
used to describe a length over which percentage identity may be measured. 

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for 
chromosome replication, segregation and maintenance. 

The term "humanized antibody" refers to an antibody molecule in which the amino acid 
sequence in the non-antigen binding regions has been altered so that the antibody more closely
resembles a human antibody, and still retains its original binding ability.

"Hybridization" refers to the process by which a polynucleotide strand anneals with a
complementary strand through base pairing under defined hybridization conditions. Specific
hybridization is an indication that two nucleic acid sequences share a high degree of
complementarity. Specific hybridization complexes form under permissive annealing conditions and
remain hybridized after the "washing" step(s). The washing step(s) is particularly important in
determining the stringency of the hybridization process, with more stringent conditions allowing less
non-specific binding, i.e., binding between pairs of nucleic acid strands that are not perfectly
matched. Permissive conditions for annealing of nucleic acid sequences are routinely determinable
by one of ordinary skill in the art and may be consistent among hybridization experiments, whereas
wash conditions may be varied among experiments to achieve the desired stringency, and therefore
hybridization specificity. Permissive annealing conditions occur, for example, at 68°C in the
presence of about 6 x SSC, about 1% (w/v) SDS, and about 100 µg/ml sheared, denatured salmon
sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature
under which the wash step is carried out. Such wash temperatures are typically selected to be about
5°C to 10°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and
conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. and D.W.
Russell (2001; Molecular Cloning: A Laboratory Manual, 3rd ed., vol. 1-3, Cold Spring Harbor Press,
Cold Spring Harbor NY, ch. 9).
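The relationship between the melting point and the wash temperature can be sketched in Python as follows. The specification does not state which equation it relies on; the sketch assumes one widely cited empirical approximation, Tm = 81.5 + 16.6 log10[Na+] + 0.41(%G+C) - 675/n, whose coefficients vary between references. The sodium concentration assigned to 1 x SSC (0.195 M) is likewise an assumption, and all function names are hypothetical.

```python
import math

# Assumption: 1 x SSC contributes about 0.195 M Na+ (0.15 M NaCl plus
# 0.015 M trisodium citrate). Other references use slightly different values.
NA_MOLAR_PER_1X_SSC = 0.195


def melting_temperature_c(sequence, ssc_fold):
    """Estimate Tm in degrees C with a common empirical approximation.

    This is an illustrative formula, not necessarily the equation
    referred to in the cited laboratory manual.
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    if ssc_fold <= 0:
        raise ValueError("SSC concentration must be positive")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    sodium_molar = ssc_fold * NA_MOLAR_PER_1X_SSC
    return 81.5 + 16.6 * math.log10(sodium_molar) + 0.41 * gc_percent - 675.0 / n


def wash_temperature_range_c(tm_c):
    """Return (low, high) wash temperatures about 10 to 5 degrees C below Tm."""
    return tm_c - 10.0, tm_c - 5.0
```

Under this approximation, lowering the SSC concentration lowers the estimated Tm, which is consistent with low-salt washes being the more stringent ones.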

High stringency conditions for hybridization between polynucleotides of the present 
invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS,
for 1 hour. Alternatively, temperatures of about 65°C, 60°C, or 42°C may be used. SSC
concentration may be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%.
Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents
include, for instance, sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic
solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under particular
circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash conditions
will be readily apparent to those of ordinary skill in the art. Hybridization, particularly under high
stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such
similarity is strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

The term "hybridization complex" refers to a complex formed between two nucleic acids by
virtue of the formation of hydrogen bonds between complementary bases. A hybridization complex
may be formed in solution (e.g., C0t or R0t analysis) or formed between one nucleic acid present in
solution and another nucleic acid immobilized on a solid support (e.g., paper, membranes, filters,
chips, pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids have
been fixed).

The words "insertion" and "addition" refer to changes in an amino acid or polynucleotide
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.

"Immune response" can refer to conditions associated with inflammation, trauma, immune
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect
cellular and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of KPP which is
capable of eliciting an immune response when introduced into a living organism, for example, a
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment
of KPP which is useful in any of the antibody production methods disclosed herein or known in the
art.

The term "microarray" refers to an arrangement of a plurality of polynucleotides, 
polypeptides, antibodies, or other chemical compounds on a substrate.

The terms "element" and "array element" refer to a polynucleotide, polypeptide, antibody, or
other chemical compound having a unique and defined position on a microarray.

The term "modulate" refers to a change in the activity of KPP. For example, modulation may
cause an increase or a decrease in protein activity, binding characteristics, or any other biological,
functional, or immunological properties of KPP.

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide, 
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or 
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material.

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where
necessary to join two protein coding regions, in the same reading frame. 

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition.
PNAs preferentially bind complementary single stranded DNA or RNA and stop transcript
elongation, and may be pegylated to extend their lifespan in the cell. 

"Post-translational modification" of a KPP may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in
the art. These processes may occur synthetically or biochemically. Biochemical modifications will
vary by cell type depending on the enzymatic milieu of KPP.

"Probe" refers to nucleic acids encoding KPP, their complements, or fragments thereof, 
which are used to detect identical, allelic or related nucleic acids. Probes are isolated
oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical
labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are
short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide
by complementary base-pairing. The primer may then be extended along the target DNA strand by a
DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic
acid, e.g., by the polymerase chain reaction (PCR).
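Annealing by complementary base-pairing can be illustrated with a short Python sketch that locates the sites on a target strand to which a primer is perfectly complementary. The function names and the example sequences are hypothetical, and the sketch deliberately ignores mismatches, degenerate bases, and thermodynamics.

```python
# Watson-Crick complements for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_annealing_sites(primer, target_strand):
    """Return 0-based start positions on target_strand (5'->3') at which the
    primer (5'->3') can anneal by perfect complementary base-pairing."""
    site = reverse_complement(primer)
    target = target_strand.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping sites
    return positions
```

For example, the hypothetical primer "TGCAAT" is complementary to the segment "ATTGCA" of the target "GGATCCATTGCA", so a single annealing site is reported at the start of that segment.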

Probes and primers as used in the present invention typically comprise at least 15 contiguous
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers 
may be considerably longer than these examples, and it is understood that any length supported by the 
specification, including the tables, figures, and Sequence Listing, may be used. 

Methods for preparing and using probes and primers are described in, for example,
Sambrook, J. and D.W. Russell (2001; Molecular Cloning: A Laboratory Manual, 3rd ed., vol. 1-3,
Cold Spring Harbor Press, Cold Spring Harbor NY), Ausubel, F.M. et al. (1999; Short Protocols in
Molecular Biology, 4th ed., John Wiley & Sons, New York NY), and Innis, M. et al. (1990; PCR
Protocols, A Guide to Methods and Applications, Academic Press, San Diego CA). PCR primer pairs
can be derived from a known sequence, for example, by using computer programs intended for that
purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge 
MA). 

Oligonucleotides for use as primers are selected using software known in the art for such 
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to
5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer
selection programs have incorporated additional features for expanded capabilities. For example, the
PrimOU primer selection program (available to the public from the Genome Center at University of
Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from
megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3
primer selection program (available to the public from the Whitehead Institute/MIT Center for
Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which
sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the
selection of oligonucleotides for microarrays. (The source code for the latter two primer selection
programs may also be obtained from their respective sources and modified to meet the user's specific
needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping
Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments,
thereby allowing selection of primers that hybridize to either the most conserved or least conserved
regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both
unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and
polynucleotide fragments identified by any of the above selection methods are useful in hybridization
technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to
identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of
oligonucleotide selection are not limited to those described above.

A "recombinant nucleic acid" is a nucleic acid that is not naturally occurring or has a
sequence that is made by an artificial combination of two or more otherwise separated segments of
sequence. This artificial combination is often accomplished by chemical synthesis or, more
commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic
engineering techniques such as those described in Sambrook and Russell (supra). The term
recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion
of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid 
sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a 
vector that is used, for example, to transform a cell. 

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions
(UTRs). Regulatory elements interact with host or viral proteins which control transcription,
translation, or RNA stability.

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid,
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent,
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in the art.

An "RNA equivalent," in reference to a DNA molecule, is composed of the same linear
sequence of nucleotides as the reference DNA molecule with the exception that all occurrences of the
nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose
instead of deoxyribose.
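At the level of a written sequence, the RNA equivalent of a DNA molecule can be derived with a one-line substitution, as in the following minimal Python sketch. The function name is hypothetical, and the change of sugar backbone from deoxyribose to ribose is a chemical property that a sequence string does not represent.

```python
def rna_equivalent(dna):
    """Return the RNA equivalent of a DNA sequence string: the same linear
    sequence with every thymine (T) written as uracil (U)."""
    seq = dna.upper()
    unexpected = set(seq) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(unexpected)}")
    return seq.replace("T", "U")
```

For instance, the hypothetical input "ATGCTT" yields "AUGCUU".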

The term "sample" is used in its broadest sense. A sample suspected of containing KPP,
nucleic acids encoding KPP, or fragments thereof may comprise a bodily fluid; an extract from a cell,
chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA, in
solution or bound to a substrate; a tissue; a tissue print; etc.

The terms "specific binding" and "specifically binding" refer to that interaction between a 
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or 
synthetic binding composition. The interaction is dependent upon the presence of a particular
structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding
molecule. For example, if an antibody is specific for epitope "A," the presence of a polypeptide
comprising the epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A
and the antibody will reduce the amount of labeled A that binds to the antibody.

The term "substantially purified" refers to nucleic acid or amino acid sequences that are
removed from their natural environment and are isolated or separated, and are at least about 60% free,
preferably at least about 75% free, and most preferably at least about 90% free from other
components with which they are naturally associated.

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides
by different amino acid residues or nucleotides, respectively.

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters,
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers,
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" or "expression profile" refers to the collective pattern of gene
expression by a particular cell type or tissue under given conditions at a given time.

"Transformation" describes a process by which exogenous DNA is introduced into a recipient
cell. Transformation may occur under natural or artificial conditions according to various methods
well known in the art, and may rely on any known method for the insertion of foreign nucleic acid
sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based
on the type of host cell being transformed and may include, but is not limited to, bacteriophage or
viral infection, electroporation, heat shock, lipofection, and particle bombardment. The term
"transformed cells" includes stably transformed cells in which the inserted DNA is capable of
replication either as an autonomously replicating plasmid or as part of the host chromosome, as well
as transiently transformed cells which express the inserted DNA or RNA for limited periods of time.

A "transgenic organism," as used herein, is any organism, including but not limited to
animals and plants, in which one or more of the cells of the organism contains heterologous nucleic
acid introduced by way of human intervention, such as by transgenic techniques well known in the
art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor
of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with
a recombinant virus. In another embodiment, the nucleic acid can be introduced by infection with a
recombinant viral vector, such as a lentiviral vector (Lois, C. et al. (2002) Science 295:868-872). The
term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but
rather is directed to the introduction of a recombinant DNA molecule. The transgenic organisms
contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, plants
and animals. The isolated DNA of the present invention can be introduced into the host by methods
known in the art, for example infection, transfection, transformation or transconjugation. Techniques
for transferring the DNA of the present invention into such organisms are widely known and
provided in references such as Sambrook and Russell (supra).

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having
at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-
1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater
sequence identity over a certain defined length. A variant may be described as, for example, an
"allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have
significant identity to a reference molecule, but will generally have a greater or lesser number of
polynucleotides due to alternate splicing during mRNA processing. The corresponding polypeptide
may possess additional functional domains or lack domains that are present in the reference molecule.
Species variants are polynucleotides that vary from one species to another. The resulting
polypeptides will generally have significant amino acid identity relative to each other. A
polymorphic variant is a variation in the polynucleotide sequence of a particular gene between
individuals of a given species. Polymorphic variants also may encompass "single nucleotide
polymorphisms" (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The
presence of SNPs may be indicative of, for example, a certain population, a disease state, or a
propensity for a disease state.
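A single nucleotide difference between two versions of a sequence can be located with a simple position-by-position comparison. The following Python sketch assumes two sequences of equal length that are already in register (no insertions or deletions); the function name and the example sequences are hypothetical.

```python
def single_base_differences(allele_a, allele_b):
    """Return a list of (1-based position, base in allele_a, base in allele_b)
    for every position at which two equal-length sequences differ."""
    if len(allele_a) != len(allele_b):
        raise ValueError("sequences must have equal length for this comparison")
    return [
        (index + 1, a, b)
        for index, (a, b) in enumerate(zip(allele_a.upper(), allele_b.upper()))
        if a != b
    ]
```

Two sequences that differ at exactly one reported position are candidates for a single nucleotide polymorphism; whether the difference is a true polymorphism rather than a sequencing error has to be established separately.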

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having
at least 40% sequence identity or sequence similarity to the particular polypeptide sequence over a
certain length of one of the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool
Version 2.0.9 (May-07-1999) set at default parameters. Such a pair of polypeptides may show, for
example, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least
91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%,
or at least 99% or greater sequence identity or sequence similarity over a certain defined length of one
of the polypeptides.

THE INVENTION

Various embodiments of the invention include new human kinases and phosphatases (KPP), 
the polynucleotides encoding KPP, and the use of these compositions for the diagnosis, treatment, or 
prevention of cardiovascular diseases, immune system disorders, neurological disorders, disorders
affecting growth and development, lipid disorders, cell proliferative disorders, and cancers.

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
embodiments of the invention. Each polynucleotide and its corresponding polypeptide are correlated
to a single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is
denoted by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an
Incyte polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide 
sequence is denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ 
ID NO:) and an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as 
shown. 

Table 2 shows sequences with homology to polypeptide embodiments of the invention as
identified by BLAST analysis against the GenBank protein (genpept) database and the PROTEOME
database. Columns 1 and 2 show the polypeptide sequence identification number (Polypeptide SEQ
ID NO:) and the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for
polypeptides of the invention. Column 3 shows the GenBank identification number (GenBank ID
NO:) of the nearest GenBank homolog and the PROTEOME database identification numbers
(PROTEOME ID NO:) of the nearest PROTEOME database homologs. Column 4 shows the
probability scores for the matches between each polypeptide and its homolog(s). Column 5 shows the
annotation of the GenBank and PROTEOME database homolog(s) along with relevant citations 
where applicable, all of which are expressly incorporated by reference herein. 

Table 3 shows various structural features of the polypeptides of the invention. Columns 1
and 2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding
Incyte polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention.
Column 3 shows the number of amino acid residues in each polypeptide. Column 4 shows amino
acid residues comprising signature sequences, domains, motifs, potential phosphorylation sites, and
potential glycosylation sites. Column 5 shows analytical methods for protein structure/function
analysis and in some cases, searchable databases to which the analytical methods were applied. 

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these
properties establish that the claimed polypeptides are kinases and phosphatases. For example, SEQ
ID NO:3 is 99% identical, from residue M1 to residue K487, to human apyrase (GenBank ID
g4583675) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The
BLAST probability score is 3.7e-264, which indicates the probability of obtaining the observed
polypeptide sequence alignment by chance. SEQ ID NO:3 also has homology to proteins that are
localized to the Golgi, have lysosomal/autophagic function, and are apyrase proteins, as determined
by BLAST analysis using the PROTEOME database. SEQ ID NO:3 also contains a GDA1/CD39
(nucleoside phosphatase family) domain as determined by searching for statistically significant
matches in the hidden Markov model (HMM)-based PFAM database of conserved protein
families/domains. (See Table 3.) Data from BLIMPS, MOTIFS, and BLAST analyses against the
PRODOM and DOMO databases provide further corroborative evidence that SEQ ID NO:3 is a
nucleoside phosphatase. SEQ ID NO:1-2 and SEQ ID NO:4-5 were analyzed and annotated in a
similar manner. The algorithms and parameters for the analysis of SEQ ID NO:1-5 are described in
Table 7.

As shown in Table 4, full length polynucleotide embodiments were assembled using cDNA 
sequences or coding (exon) sequences derived from genomic DNA, or any combination of these two
types of sequences. Column 1 lists the polynucleotide sequence identification number
(Polynucleotide SEQ ID NO:), the corresponding Incyte polynucleotide consensus sequence number
(Incyte ID) for each polynucleotide of the invention, and the length of each polynucleotide sequence
in basepairs. Column 2 lists fragments of the polynucleotides which are useful, for example, in
hybridization or amplification technologies that identify SEQ ID NO:6-10 or that distinguish between
SEQ ID NO:6-10 and related polynucleotides. Column 3 shows identification numbers
corresponding to cDNA sequences, coding sequences (exons) predicted from genomic DNA, and/or
sequence assemblages comprised of both cDNA and genomic DNA. These sequences were used to
assemble the full length polynucleotide embodiments. Columns 4 and 5 of Table 4 show the
nucleotide start (5') and stop (3') positions of the cDNA and/or genomic sequences in column 3 
relative to their respective full length sequences. 

The identification numbers in Column 3 of Table 4 may refer specifically, for example, to
Incyte cDNAs along with their corresponding cDNA libraries. For example, 5026615F6 is the
identification number of an Incyte cDNA sequence, and COLCDIT01 is the cDNA library from
which it is derived. Incyte cDNAs for which cDNA libraries are not indicated were derived from
pooled cDNA libraries (e.g., 56087809H1). Alternatively, the identification numbers in column 3
may refer to GenBank cDNAs or ESTs (e.g., g22918955) which contributed to the assembly of the
full length polynucleotides. In addition, the identification numbers in column 3 may identify
sequences derived from the ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those
sequences including the designation "ENST"). Alternatively, the identification numbers in column 3
may be derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences
including the designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those
sequences including the designation "NP"). Alternatively, the identification numbers in column 3
may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an "exon
stitching" algorithm. For example, FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched"
sequence in which XXXXXX is the identification number of the cluster of sequences to which the
algorithm was applied, and YYYYY is the number of the prediction generated by the algorithm, and
N1,2,3..., if present, represent specific exons that may have been manually edited during analysis (See
Example V). Alternatively, the identification numbers in column 3 may refer to assemblages of
exons brought together by an "exon-stretching" algorithm. For example,
FLXXXXXX_gAAAAA_gBBBBB_1_N is the identification number of a "stretched" sequence, with
XXXXXX being the Incyte project identification number, gAAAAA being the GenBank identification
number of the human genomic sequence to which the "exon-stretching" algorithm was applied,
gBBBBB being the GenBank identification number or NCBI RefSeq identification number of the
nearest GenBank protein homolog, and N referring to specific exons (See Example V). In instances
where a RefSeq sequence was used as a protein homolog for the "exon-stretching" algorithm, a
RefSeq identifier (denoted by "NM," "NP," or "NT") may be used in place of the GenBank identifier
(i.e., gBBBBB).
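The identification-number conventions described for Column 3 of Table 4 can be summarized with a small Python sketch that sorts an identifier into one of the categories named in the text. The matching rules are assumptions inferred from the examples and designations given here (for instance, that an Incyte cDNA identifier consists of digits, one capital letter, and digits, and that a designation appears at the start of the identifier); they are not an official grammar, and the function name is hypothetical.

```python
import re


def classify_identifier(identifier):
    """Classify a Table 4, Column 3 identification number.

    The rules below are illustrative assumptions based on the designations
    and examples quoted in the text.
    """
    ident = identifier.strip()
    if ident.startswith("FL"):
        return "stitched or stretched sequence"
    if ident.startswith("ENST"):
        return "ENSEMBL sequence"
    if ident.startswith(("NM", "NT")):
        return "RefSeq nucleotide record"
    if ident.startswith("NP"):
        return "RefSeq protein record"
    if re.fullmatch(r"g\d+", ident):
        return "GenBank cDNA or EST"
    if re.fullmatch(r"\d+[A-Z]\d+", ident):
        return "Incyte cDNA"
    return "unrecognized"
```

Applied to the examples given in the text, "5026615F6" and "56087809H1" are classified as Incyte cDNAs and "g22918955" as a GenBank cDNA or EST. The "FL" test is performed first because stitched and stretched identifiers may themselves embed GenBank or RefSeq identifiers.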

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from
genomic DNA sequences, or derived from a combination of sequence analysis methods. The
following Table lists examples of component sequence prefixes and corresponding sequence analysis
methods.

Prefix            Type of analysis and/or examples of programs

GNN, GFG, ENST    Exon prediction from genomic sequences using, for example,
                  GENSCAN (Stanford University, CA, USA) or FGENES
                  (Computer Genomics Group, The Sanger Centre, Cambridge, UK).

GBI               Hand-edited analysis of genomic sequences.

FL                Stitched or stretched genomic sequences (see Example V).

                  Full length transcript and exon prediction from mapping of EST
                  sequences to the genome. Genomic location and EST composition
                  data are combined to predict the exons and resulting transcript.

In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in Table
4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte cDNA
identification numbers are not shown.

Table 5 shows the representative cDNA libraries for those full length polynucleotides which 
were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte cDNA 
library which is most frequently represented by the Incyte cDNA sequences which were used to
assemble and confirm the above polynucleotides. The tissues and vectors which were used to
construct the cDNA libraries shown in Table 5 are described in Table 6.

Table 8 shows single nucleotide polymorphisms (SNPs) found in polynucleotide sequences of
the invention, along with allele frequencies in different human populations. Columns 1 and 2 show
the polynucleotide sequence identification number (SEQ ID NO:) and the corresponding Incyte
project identification number (PID) for polynucleotides of the invention. Column 3 shows the Incyte
identification number for the EST in which the SNP was detected (EST ID), and column 4 shows the
identification number for the SNP (SNP ID). Column 5 shows the position within the EST sequence
at which the SNP is located (EST SNP), and column 6 shows the position of the SNP within the full-
length polynucleotide sequence (CB1 SNP). Column 7 shows the allele found in the EST sequence.
Columns 8 and 9 show the two alleles found at the SNP site. Column 10 shows the amino acid
encoded by the codon including the SNP site, based upon the allele found in the EST. Columns 11-
14 show the frequency of allele 1 in four different human populations. An entry of n/d (not detected)
indicates that the frequency of allele 1 in the population was too low to be detected, while n/a (not
available) indicates that the allele frequency was not determined for the population.
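When the allele-frequency columns of Table 8 are read programmatically, the two non-numeric entries need to be kept distinct, since "n/d" reports a measurement below the detection limit while "n/a" reports that no measurement was made. The following Python sketch shows one way to do this; the function name, the returned labels, and the assumption that measured frequencies are written as plain decimal numbers are all hypothetical.

```python
def parse_allele_frequency(entry):
    """Interpret one allele-frequency cell.

    Returns a (status, value) pair, where value is a float for measured
    frequencies and None for the "n/d" and "n/a" entries.
    """
    text = entry.strip().lower()
    if text == "n/d":
        return ("not detected", None)   # frequency too low to be detected
    if text == "n/a":
        return ("not available", None)  # frequency not determined
    return ("measured", float(text))    # assumption: plain decimal notation
```

Keeping the status separate from the numeric value avoids silently treating an undetermined frequency as zero.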

The invention also encompasses KPP variants. Various embodiments of KPP variants can
have at least about 80%, at least about 90%, or at least about 95% amino acid sequence identity to the
KPP amino acid sequence, and can contain at least one functional or structural characteristic of KPP.

Various embodiments also encompass polynucleotides which encode KPP. In a particular
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected
from the group consisting of SEQ ID NO:6-10, which encodes KPP. The polynucleotide sequences
of SEQ ID NO:6-10, as presented in the Sequence Listing, embrace the equivalent RNA sequences,
wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar
backbone is composed of ribose instead of deoxyribose.
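
As a minimal sketch of the DNA-to-RNA equivalence just described, the base substitution can be
written as a one-line string operation. The function name dna_to_rna and the choice to accept the
ambiguity code N are assumptions made for illustration; the ribose/deoxyribose difference is a
chemical property and has no counterpart in the sequence text.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA sequence equivalent to a DNA sequence (thymine -> uracil)."""
    seq = dna.upper()
    unexpected = set(seq) - set("ACGTN")
    if unexpected:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(unexpected)}")
    return seq.replace("T", "U")
```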

The invention also encompasses variants of a polynucleotide encoding KPP. In particular,
such a variant polynucleotide will have at least about 70%, or alternatively at least about 85%, or
even at least about 95% polynucleotide sequence identity to a polynucleotide encoding KPP. A
particular aspect of the invention encompasses a variant of a polynucleotide comprising a sequence
selected from the group consisting of SEQ ID NO:6-10 which has at least about 70%, or alternatively
at least about 85%, or even at least about 95% polynucleotide sequence identity to a nucleic acid
sequence selected from the group consisting of SEQ ID NO:6-10. Any one of the polynucleotide
variants described above can encode a polypeptide which contains at least one functional or structural
characteristic of KPP.
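
The identity thresholds above can be illustrated with a simple calculation. The sketch below
assumes the two sequences have already been aligned (equal length, "-" for gaps) and counts
identical positions over all aligned columns; this is one common convention and is an assumption
here, since the controlling definition of percent identity is the one given in "Definitions". The
function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns at which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches a stated threshold such as 70, 85, or 95 percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Different alignment programs normalize by different lengths, so a value computed this way should
be treated as indicative only.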

In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant
of a polynucleotide encoding KPP. A splice variant may have portions which have significant
sequence identity to a polynucleotide encoding KPP, but will generally have a greater or lesser
number of nucleotides due to additions or deletions of blocks of sequence arising from alternate
splicing during mRNA processing. A splice variant may have less than about 70%, or alternatively
less than about 60%, or alternatively less than about 50% polynucleotide sequence identity to a
polynucleotide encoding KPP over its entire length; however, portions of the splice variant will have
at least about 70%, or alternatively at least about 85%, or alternatively at least about 95%, or
alternatively 100% polynucleotide sequence identity to portions of the polynucleotide encoding KPP.
Any one of the splice variants described above can encode a polypeptide which contains at least one
functional or structural characteristic of KPP.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of polynucleotide sequences encoding KPP, some bearing minimal
similarity to the polynucleotide sequences of any known and naturally occurring gene, may be
produced. Thus, the invention contemplates each and every possible variation of polynucleotide
sequence that could be made by selecting combinations based on possible codon choices. These
combinations are made in accordance with the standard triplet genetic code as applied to the
polynucleotide sequence of naturally occurring KPP, and all such variations are to be considered as
being specifically disclosed.
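
The size of the "multitude" referred to above can be made concrete. The following sketch counts
how many distinct coding sequences specify a given peptide under the standard triplet genetic
code, by multiplying the number of synonymous codons available at each position. The function
name and the example peptides are illustrative assumptions; stop codons and non-standard codes
are not considered.

```python
from math import prod

BASES = "TCAG"
# Amino acids for the 64 codons enumerated in TCAG order (standard code, "*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Group codons by the amino acid they encode.
SYNONYMOUS_CODONS = {}
for codon, amino_acid in CODON_TABLE.items():
    SYNONYMOUS_CODONS.setdefault(amino_acid, []).append(codon)


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    residues = peptide.upper()
    unknown = [aa for aa in residues if aa not in SYNONYMOUS_CODONS or aa == "*"]
    if unknown:
        raise ValueError(f"not a standard amino acid: {unknown[0]}")
    return prod(len(SYNONYMOUS_CODONS[aa]) for aa in residues)
```

Because the count is a product over residues, it grows exponentially with peptide length, which
is why enumerating every variation explicitly is impractical for a full-length protein.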

Although polynucleotides which encode KPP and its variants are generally capable of
hybridizing to polynucleotides encoding naturally occurring KPP under appropriately selected
conditions of stringency, it may be advantageous to produce polynucleotides encoding KPP or its
derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring
codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a
particular prokaryotic or eukaryotic host in accordance with the frequency with which particular
codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence
encoding KPP and its derivatives without altering the encoded amino acid sequences include the
production of RNA transcripts having more desirable properties, such as a greater half-life, than
transcripts produced from the naturally occurring sequence.
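
Codon selection of the kind described above can be sketched as a synonymous recoding step: each
codon is replaced by the host's preferred codon for the same amino acid, so the encoded amino
acid sequence is unchanged. The function names and the preference table passed in by the caller
are hypothetical; real preferences would come from measured codon usage for the chosen host.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is given.

    `preferred` maps a one-letter amino acid to a codon; codons for amino
    acids missing from the map are left unchanged.
    """
    for amino_acid, codon in preferred.items():
        if CODON_TABLE.get(codon.upper()) != amino_acid:
            raise ValueError(f"{codon} does not encode {amino_acid}")
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        recoded.append(preferred.get(CODON_TABLE[codon], codon).upper())
    return "".join(recoded)
```

Checking that translate gives the same result before and after recoding is a quick way to
confirm that only codon usage, and not the encoded polypeptide, has been altered.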

The invention also encompasses production of polynucleotides which encode KPP and KPP
derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic
polynucleotide may be inserted into any of the many available expression vectors and cell systems
using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce
mutations into a polynucleotide encoding KPP or any fragment thereof.

Embodiments of the invention can also include polynucleotides that are capable of
hybridizing to the claimed polynucleotides, and, in particular, to those having the sequences shown in
SEQ ID NO:6-10 and fragments thereof, under various conditions of stringency (Wahl, G.M. and S.L.
Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 152:507-511).
Hybridization conditions, including annealing and wash conditions, are described in
"Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied
Biosystems), thermostable T7 polymerase (Amersham Biosciences, Piscataway NJ), or combinations
of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification
system (Invitrogen, Carlsbad CA). Preferably, sequence preparation is automated with machines such
as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV), PTC200 thermal cycler (MJ
Research, Watertown MA) and ABI CATALYST 800 thermal cycler (Applied Biosystems).
Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system (Applied
Biosystems), the MEGABACE 1000 DNA sequencing system (Amersham Biosciences), or other
systems known in the art. The resulting sequences are analyzed using a variety of algorithms which
are well known in the art (Ausubel et al., supra, ch. 7; Meyers, R.A. (1995) Molecular Biology and
Biotechnology, Wiley VCH, New York NY, pp. 856-853).

The nucleic acids encoding KPP may be extended utilizing a partial nucleotide sequence and
employing various PCR-based methods known in the art to detect upstream sequences, such as
promoters and regulatory elements. For example, one method which may be employed,
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic
DNA within a cloning vector (Sarkar, G. (1993) PCR Methods Applic. 2:318-322). Another method,
inverse PCR, uses primers that extend in divergent directions to amplify unknown sequence from a
circularized template. The template is derived from restriction fragments comprising a known
genomic locus and surrounding sequences (Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). A
third method, capture PCR, involves PCR amplification of DNA fragments adjacent to known
sequences in human and yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR
Methods Applic. 1:111-119). In this method, multiple restriction enzyme digestions and ligations
may be used to insert an engineered double-stranded sequence into a region of unknown sequence
before performing PCR. Other methods which may be used to retrieve unknown sequences are
known in the art (Parker, J.D. et al. (1991) Nucleic Acids Res. 19:3055-3060). Additionally, one may
use PCR, nested primers, and PROMOTERFINDER libraries (BD Clontech, Palo Alto CA) to walk
genomic DNA. This procedure avoids the need to screen libraries and is useful in finding intron/exon
junctions. For all PCR-based methods, primers may be designed using commercially available
software, such as OLIGO 4.06 primer analysis software (National Biosciences, Plymouth MN) or
another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of
about 50% or more, and to anneal to the template at temperatures of about 68°C to 72°C.
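
Two of the primer design criteria stated above, a length of about 22 to 30 nucleotides and a GC
content of about 50% or more, lend themselves to a simple check. The sketch below is illustrative
only: the function names are hypothetical, it does not reproduce the behavior of any commercial
primer analysis software, and it deliberately does not estimate annealing temperature because no
formula for that is given here.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_primer(primer: str, min_len: int = 22, max_len: int = 30,
                 min_gc: float = 0.50) -> dict:
    """Report whether a candidate primer meets the length and GC criteria."""
    length_ok = min_len <= len(primer) <= max_len
    gc_ok = gc_fraction(primer) >= min_gc
    return {"length_ok": length_ok, "gc_ok": gc_ok, "passes": length_ok and gc_ok}
```

Because the text says "about", the numeric limits are exposed as parameters rather than fixed.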

When screening for full length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T)
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence
into 5' non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to analyze
the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary
sequencing may employ flowable polymers for electrophoretic separation, four different
nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for
detection of the emitted wavelengths. Output/light intensity may be converted to electrical signal
using appropriate software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and
the entire process from loading of samples to computer analysis and electronic data display may be
computer controlled. Capillary electrophoresis is especially preferable for sequencing small DNA
fragments which may be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotides or fragments thereof which encode
KPP may be cloned in recombinant DNA molecules that direct expression of KPP, or fragments or
functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of the
genetic code, other polynucleotides which encode substantially the same or a functionally equivalent
polypeptide may be produced and used to express KPP.

The polynucleotides of the invention can be engineered using methods generally known in
the art in order to alter KPP-encoding sequences for a variety of purposes including, but not limited
to, modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by
random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be
used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed
mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation
patterns, change codon preference, produce splice variants, and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent No.
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or
improve the biological properties of KPP, such as its biological or enzymatic activity or its ability to
bind to other molecules or compounds. DNA shuffling is a process by which a library of gene
variants is produced using PCR-mediated recombination of gene fragments. The library is then
subjected to selection or screening procedures that identify those gene variants with the desired
properties. These preferred variants may then be pooled and further subjected to recursive rounds of
DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial"
breeding and rapid molecular evolution. For example, fragments of a single gene containing random
point mutations may be recombined, screened, and then reshuffled until the desired properties are
optimized. Alternatively, fragments of a given gene may be recombined with fragments of
homologous genes in the same gene family, either from the same or different species, thereby
maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable
manner.



In another embodiment, polynucleotides encoding KPP may be synthesized, in whole or in
part, using one or more chemical methods well known in the art (Caruthers, M.H. et al. (1980)
Nucleic Acids Symp. Ser. 7:215-223; Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232).
Alternatively, KPP itself or a fragment thereof may be synthesized using chemical methods known in
the art. For example, peptide synthesis can be performed using various solution-phase or solid-phase
techniques (Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH Freeman, New
York NY, pp. 55-60; Roberge, J.Y. et al. (1995) Science 269:202-204). Automated synthesis may be
achieved using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the amino
acid sequence of KPP, or any part thereof, may be altered during direct synthesis and/or combined
with sequences from other proteins, or any part thereof, to produce a variant polypeptide or a
polypeptide having a sequence of a naturally occurring polypeptide.

The peptide may be substantially purified by preparative high performance liquid
chromatography (Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421). The
composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing
(Creighton, supra, pp. 28-53).

In order to express a biologically active KPP, the polynucleotides encoding KPP or
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains
the necessary elements for transcriptional and translational control of the inserted coding sequence in
a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotides encoding
KPP. Such elements may vary in their strength and specificity. Specific initiation signals may also
be used to achieve more efficient translation of polynucleotides encoding KPP. Such signals include
the ATG initiation codon and adjacent sequences, e.g., the Kozak sequence. In cases where a
polynucleotide sequence encoding KPP and its initiation codon and upstream regulatory sequences
are inserted into the appropriate expression vector, no additional transcriptional or translational
control signals may be needed. However, in cases where only coding sequence, or a fragment
thereof, is inserted, exogenous translational control signals including an in-frame ATG initiation
codon should be provided by the vector. Exogenous translational elements and initiation codons may
be of various origins, both natural and synthetic. The efficiency of expression may be enhanced by
the inclusion of enhancers appropriate for the particular host cell system used (Scharf, D. et al. (1994)
Results Probl. Cell Differ. 20:125-162).

Methods which are well known to those skilled in the art may be used to construct expression
vectors containing polynucleotides encoding KPP and appropriate transcriptional and translational
control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques,
and in vivo genetic recombination (Sambrook and Russell, supra, ch. 1-4 and 8; Ausubel et al.,
supra, ch. 1, 3, and 15).
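
The requirement noted above, that a vector supply an in-frame ATG initiation codon when only
coding sequence is inserted, comes down to arithmetic on reading frames. The sketch below is a
simplified illustration: the function names are hypothetical, positions are 0-based indices into
the assembled construct, and context effects such as the Kozak sequence are not modeled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(construct: str):
    """Return (start, end) of the reading frame opened by the first ATG.

    `end` is the index just past the first in-frame stop codon. Returns None
    if there is no ATG or no in-frame stop codon.
    """
    seq = construct.upper()
    start = seq.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None


def insert_is_in_frame(atg_index: int, insert_index: int) -> bool:
    """True if the insert's first codon starts a whole number of codons after the ATG."""
    return (insert_index - atg_index) % 3 == 0
```

An insert that is one or two bases out of frame relative to the vector ATG would be read in a
different frame and would typically reach a premature stop codon.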

A variety of expression vector/host systems may be utilized to contain and express
polynucleotides encoding KPP. These include, but are not limited to, microorganisms such as
bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors;
yeast transformed with yeast expression vectors; insect cell systems infected with viral expression
vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g.,
cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors
(e.g., Ti or pBR322 plasmids); or animal cell systems (Sambrook and Russell, supra; Ausubel et al.,
supra; Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al.
(1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther.
7:1937-1945; Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill Yearbook of Science and
Technology (1992) McGraw Hill, New York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc.
Natl. Acad. Sci. USA 81:3655-3659; Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355).
Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from
various bacterial plasmids, may be used for delivery of polynucleotides to the targeted organ, tissue,
or cell population (Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5:350-356; Yu, M. et al. (1993)
Proc. Natl. Acad. Sci. USA 90:6340-6344; Buller, R.M. et al. (1985) Nature 317:813-815; McGregor,
D.P. et al. (1994) Mol. Immunol. 31:219-226; Verma, I.M. and N. Somia (1997) Nature
389:239-242). The invention is not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending
upon the use intended for polynucleotides encoding KPP. For example, routine cloning, subcloning,
and propagation of polynucleotides encoding KPP can be achieved using a multifunctional E. coli
vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1 plasmid (Invitrogen).
Ligation of polynucleotides encoding KPP into the vector's multiple cloning site disrupts the lacZ
gene, allowing a colorimetric screening procedure for identification of transformed bacteria
containing recombinant molecules. In addition, these vectors may be useful for in vitro transcription,
dideoxy sequencing, single strand rescue with helper phage, and creation of nested deletions in the
cloned sequence (Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509). When
large quantities of KPP are needed, e.g., for the production of antibodies, vectors which direct high
level expression of KPP may be used. For example, vectors containing the strong, inducible SP6 or
T7 bacteriophage promoter may be used.

Yeast expression systems may be used for production of KPP. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable
integration of foreign polynucleotide sequences into the host genome for stable propagation (Ausubel
et al., supra; Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184).

Plant systems may also be used for expression of KPP. Transcription of polynucleotides
encoding KPP may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used
alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J.
6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used (Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984)
Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105). These
constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated
transfection (The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196).

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, polynucleotides encoding KPP may be ligated
into an adenovirus transcription/translation complex consisting of the late promoter and tripartite
leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to
obtain infective virus which expresses KPP in host cells (Logan, J. and T. Shenk (1984) Proc. Natl.
Acad. Sci. USA 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus
(RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based
vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are
constructed and delivered via conventional delivery methods (liposomes, polycationic amino
polymers, or vesicles) for therapeutic purposes (Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355).

For long term production of recombinant proteins in mammalian systems, stable expression
of KPP in cell lines is preferred. For example, polynucleotides encoding KPP can be transformed
into cell lines using expression vectors which may contain viral origins of replication and/or
endogenous expression elements and a selectable marker gene on the same or on a separate vector.
Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days in
enriched media before being switched to selective media. The purpose of the selectable marker is to
confer resistance to a selective agent, and its presence allows growth and recovery of cells which
successfully express the introduced sequences. Resistant clones of stably transformed cells may be
propagated using tissue culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine
phosphoribosyltransferase genes, for use in tk- and apr- cells, respectively (Wigler, M. et al. (1977)
Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823). Also, antimetabolite, antibiotic, or
herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat
confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively (Wigler, M. et
al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol.
150:1-14). Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular
requirements for metabolites (Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA
85:8047-8051). Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; BD Clontech),
β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used.
These markers can be used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system (Rhodes, C.A. (1995)
Methods Mol. Biol. 55:121-131).

Although the presence/absence of marker gene expression suggests that the gene of interest is
also present, the presence and expression of the gene may need to be confirmed. For example, if the
sequence encoding KPP is inserted within a marker gene sequence, transformed cells containing
polynucleotides encoding KPP can be identified by the absence of marker gene function.
Alternatively, a marker gene can be placed in tandem with a sequence encoding KPP under the
control of a single promoter. Expression of the marker gene in response to induction or selection
usually indicates expression of the tandem gene as well.

In general, host cells that contain the polynucleotide encoding KPP and that express KPP may
be identified by a variety of procedures known to those of skill in the art. These procedures include,
but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein
bioassay or immunoassay techniques for the detection and/or quantification of nucleic acid or
protein sequences.

Immunological methods for detecting and measuring the expression of KPP using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on KPP is preferred, but a competitive
binding assay may be employed. These and other assays are well known in the art (Hampton, R. et al.
(1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN, Sect. IV; Coligan, J.E. et
al. (1997) Current Protocols in Immunology, Greene Pub. Associates and Wiley-Interscience, New
York NY; Pound, J.D. (1998) Immunochemical Protocols, Humana Press, Totowa NJ).

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled
hybridization or PCR probes for detecting sequences related to polynucleotides encoding KPP include
oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.
Alternatively, polynucleotides encoding KPP, or any fragments thereof, may be cloned into a vector
for the production of an mRNA probe. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase
such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety
of commercially available kits, such as those provided by Amersham Biosciences, Promega (Madison
WI), and US Biochemical. Suitable reporter molecules or labels which may be used for ease of
detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as
well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Host cells transformed with polynucleotides encoding KPP may be cultured under conditions
suitable for the expression and recovery of the protein from cell culture. The protein produced by a
transformed cell may be secreted or retained intracellularly depending on the sequence and/or the
vector used. As will be understood by those of skill in the art, expression vectors containing
polynucleotides which encode KPP may be designed to contain signal sequences which direct
secretion of KPP through a prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted polynucleotides or to process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited to, acetylation, carboxylation,
glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves
a "prepro" or "pro" form of the protein may also be used to specify protein targeting, folding, and/or
activity. Different host cells which have specific cellular machinery and characteristic mechanisms
for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the
American Type Culture Collection (ATCC, Manassas VA) and may be chosen to ensure the correct
modification and processing of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant polynucleotides
encoding KPP may be ligated to a heterologous sequence resulting in translation of a fusion protein in
any of the aforementioned host systems. For example, a chimeric KPP protein containing a
heterologous moiety that can be recognized by a commercially available antibody may facilitate the
screening of peptide libraries for inhibitors of KPP activity. Heterologous protein and peptide
moieties may also facilitate purification of fusion proteins using commercially available affinity
matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose
binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc,
and hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion
proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate
resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of
fusion proteins using commercially available monoclonal and polyclonal antibodies that specifically
recognize these epitope tags. A fusion protein may also be engineered to contain a proteolytic
cleavage site located between the KPP encoding sequence and the heterologous protein sequence, so
that KPP may be cleaved away from the heterologous moiety following purification. Methods for
fusion protein expression and purification are discussed in Ausubel et al. (supra, ch. 10 and 16). A
variety of commercially available kits may also be used to facilitate expression and purification of
fusion proteins.
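
The pairing of fusion moieties with purification matrices stated above can be summarized as a
lookup table. The sketch below merely restates those pairings in code form; the dictionary and
function names are hypothetical and chosen for illustration.

```python
# Fusion moiety -> immobilized ligand used for affinity purification,
# as paired in the text above.
AFFINITY_LIGAND = {
    "GST": "glutathione",
    "MBP": "maltose",
    "Trx": "phenylarsine oxide",
    "CBP": "calmodulin",
    "6-His": "metal-chelate",
}

# Epitope tags purified with tag-specific antibodies rather than a ligand resin.
IMMUNOAFFINITY_TAGS = {"FLAG", "c-myc", "HA"}


def purification_route(tag: str) -> str:
    """Describe how a fusion protein carrying the given tag would be purified."""
    if tag in AFFINITY_LIGAND:
        return f"immobilized {AFFINITY_LIGAND[tag]} resin"
    if tag in IMMUNOAFFINITY_TAGS:
        return "immunoaffinity (tag-specific antibody)"
    raise KeyError(f"no purification route recorded for tag {tag!r}")
```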

In another embodiment, synthesis of radiolabeled KPP may be achieved in vitro using the
TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple
transcription and translation of protein-coding sequences operably associated with the T7, T3, or SP6
promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for
example, 35S-methionine.

KPP, fragments of KPP, or variants of KPP may be used to screen for compounds that
specifically bind to KPP. One or more test compounds may be screened for specific binding to KPP.
In various embodiments, 1, 2, 3, 4, 5, 10, 20, 50, 100, or 200 test compounds can be screened for
specific binding to KPP. Examples of test compounds can include antibodies, anticalins,
oligonucleotides, proteins (e.g., ligands or receptors), or small molecules.

In related embodiments, variants of KPP can be used to screen for binding of test compounds,
such as antibodies, to KPP, a variant of KPP, or a combination of KPP and/or one or more variants
of KPP. In an embodiment, a variant of KPP can be used to screen for compounds that bind to a variant
of KPP, but not to KPP having the exact sequence of a sequence of SEQ ID NO:1-5. KPP variants
used to perform such screening can have a range of sequence identity to
KPP, with various embodiments having 60%, 70%, 75%, 80%, 85%, 90%, and 95% sequence
identity.

In an embodiment, a compound identified in a screen for specific binding to KPP can be
closely related to the natural ligand of KPP, e.g., a ligand or fragment thereof, a natural substrate, a
structural or functional mimetic, or a natural binding partner (Coligan, J.E. et al. (1991) Current
Protocols in Immunology 1(2):Chapter 5). In another embodiment, the compound thus identified can
be a natural ligand of a receptor KPP (Howard, A.D. et al. (2001) Trends Pharmacol. Sci. 22:132-140;
Wise, A. et al. (2002) Drug Discovery Today 7:235-246).

In other embodiments, a compound identified in a screen for specific binding to KPP can be
closely related to the natural receptor to which KPP binds, at least a fragment of the receptor, or a
fragment of the receptor including all or a portion of the ligand binding site or binding pocket. For
example, the compound may be a receptor for KPP which is capable of propagating a signal, or a
decoy receptor for KPP which is not capable of propagating a signal (Ashkenazi, A. and V.M. Dixit
(1999) Curr. Opin. Cell Biol. 11:255-260; Mantovani, A. et al. (2001) Trends Immunol. 22:328-336).
The compound can be rationally designed using known techniques. Examples of such techniques
include those used to construct the compound etanercept (ENBREL; Amgen Inc., Thousand Oaks
CA), which is efficacious for treating rheumatoid arthritis in humans. Etanercept is an engineered
p75 tumor necrosis factor (TNF) receptor dimer linked to the Fc portion of human IgG1 (Taylor, P.C.
et al. (2001) Curr. Opin. Immunol. 13:611-616).

In one embodiment, two or more antibodies having similar or, alternatively, different
specificities can be screened for specific binding to KPP, fragments of KPP, or variants of KPP. The
binding specificity of the antibodies thus screened can thereby be selected to identify particular
fragments or variants of KPP. In one embodiment, an antibody can be selected such that its binding
specificity allows for preferential identification of specific fragments or variants of KPP. In another
embodiment, an antibody can be selected such that its binding specificity allows for preferential
diagnosis of a specific disease or condition having increased, decreased, or otherwise abnormal
production of KPP.

In an embodiment, anticalins can be screened for specific binding to KPP, fragments of KPP,
or variants of KPP. Anticalins are ligand-binding proteins that have been constructed based on a
lipocalin scaffold (Weiss, G.A. and H.B. Lowman (2000) Chem. Biol. 7:R177-R184; Skerra, A.
(2001) J. Biotechnol. 74:257-275). The protein architecture of lipocalins can include a beta-barrel
having eight antiparallel beta-strands, which supports four loops at its open end. These loops form
the natural ligand-binding site of the lipocalins, a site which can be re-engineered in vitro by amino
acid substitutions to impart novel binding specificities. The amino acid substitutions can be made
using methods known in the art or described herein, and can include conservative substitutions (e.g.,
substitutions that do not alter binding specificity) or substitutions that modestly, moderately, or
significantly alter binding specificity.

In one embodiment, screening for compounds which specifically bind to, stimulate, or inhibit
KPP involves producing appropriate cells which express KPP, either as a secreted protein or on the
cell membrane. Preferred cells can include cells from mammals, yeast, Drosophila, or E. coli. Cells
expressing KPP or cell membrane fractions which contain KPP are then contacted with a test
compound and binding, stimulation, or inhibition of activity of either KPP or the compound is
analyzed.

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example,
the assay may comprise the steps of combining at least one test compound with KPP, either in
solution or affixed to a solid support, and detecting the binding of KPP to the compound.
Alternatively, the assay may detect or measure binding of a test compound in the presence of a
labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical
libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a
solid support.

An assay can be used to assess the ability of a compound to bind to its natural ligand and/or
to inhibit the binding of its natural ligand to its natural receptors. Examples of such assays include
radio-labeling assays such as those described in U.S. Patent No. 5,914,236 and U.S. Patent No.
6,372,724. In a related embodiment, one or more amino acid substitutions can be introduced into a
polypeptide compound (such as a receptor) to improve or alter its ability to bind to its natural ligands
(Matthews, D.J. and J.A. Wells (1994) Chem. Biol. 1:25-30). In another related embodiment, one or
more amino acid substitutions can be introduced into a polypeptide compound (such as a ligand) to
improve or alter its ability to bind to its natural receptors (Cunningham, B.C. and J.A. Wells (1991)
Proc. Natl. Acad. Sci. USA 88:3407-3411; Lowman, H.B. et al. (1991) J. Biol. Chem. 266:10982-10988).

KPP, fragments of KPP, or variants of KPP may be used to screen for compounds that
modulate the activity of KPP. Such compounds may include agonists, antagonists, or partial or
inverse agonists. In one embodiment, an assay is performed under conditions permissive for KPP
activity, wherein KPP is combined with at least one test compound, and the activity of KPP in the
presence of a test compound is compared with the activity of KPP in the absence of the test
compound. A change in the activity of KPP in the presence of the test compound is indicative of a
compound that modulates the activity of KPP. Alternatively, a test compound is combined with an in
vitro or cell-free system comprising KPP under conditions suitable for KPP activity, and the assay is
performed. In either of these assays, a test compound which modulates the activity of KPP may do so
indirectly and need not come in direct contact with KPP. At least one and up to a
plurality of test compounds may be screened.
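The comparison described above (activity of KPP with the test compound versus activity without it) can be sketched as a simple ratio of replicate means. This is only an illustrative outline: the function names, the use of replicate means, and the 20% tolerance band used to call a change are hypothetical choices for this example and are not specified in the text; a real assay would apply an appropriate statistical test.

```python
from statistics import mean


def fold_change(activity_with, activity_without):
    """Ratio of mean activity with the test compound to mean activity without it."""
    baseline = mean(activity_without)
    if baseline == 0:
        raise ValueError("baseline activity must be non-zero")
    return mean(activity_with) / baseline


def classify_modulation(activity_with, activity_without, tolerance=0.2):
    """Label a test compound by the direction of change in measured activity.

    'tolerance' is a hypothetical cut-off: ratios within 1 +/- tolerance
    are reported as no detectable change.
    """
    ratio = fold_change(activity_with, activity_without)
    if ratio > 1 + tolerance:
        return "increases activity"
    if ratio < 1 - tolerance:
        return "decreases activity"
    return "no change detected"
```

With replicate readings of roughly 5 units in the presence of a compound and roughly 10 units in its absence, the ratio is about 0.5 and the compound would be flagged as decreasing activity.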

In another embodiment, polynucleotides encoding KPP or their mammalian homologs may be
"knocked out" in an animal model system using homologous recombination in embryonic stem (ES)
cells. Such techniques are well known in the art and are useful for the generation of animal models of
human disease (see, e.g., U.S. Patent No. 5,175,383 and U.S. Patent No. 5,767,337). For example,
mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and
grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted
by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science
244:1288-1292). The vector integrates into the corresponding region of the host genome by
homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP
system to knockout a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D.
(1996) J. Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330).
Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from
the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and
the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous
strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

Polynucleotides encoding KPP may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147).

Polynucleotides encoding KPP can also be used to create "knockin" humanized animals
(pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a
region of a polynucleotide encoding KPP is injected into animal ES cells, and the injected sequence
integrates into the animal cell genome. Transformed cells are injected into blastulae, and the
blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and
treated with potential pharmaceutical agents to obtain information on treatment of a human disease.
Alternatively, a mammal inbred to overexpress KPP, e.g., by secreting KPP in its milk, may also
serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
THERAPEUTICS 

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists
between regions of KPP and kinases and phosphatases. In addition, examples of tissues expressing
KPP can be found in Table 6 and can also be found in Example XI. Therefore, KPP appears to play a
role in cardiovascular diseases, immune system disorders, neurological disorders, disorders affecting
growth and development, lipid disorders, cell proliferative disorders, and cancers. In the treatment of
disorders associated with increased KPP expression or activity, it is desirable to decrease the
expression or activity of KPP. In the treatment of disorders associated with decreased KPP
expression or activity, it is desirable to increase the expression or activity of KPP.

Therefore, in one embodiment, KPP or a fragment or derivative thereof may be administered
to a subject to treat or prevent a disorder associated with decreased expression or activity of KPP.
Examples of such disorders include, but are not limited to, a cardiovascular disease such as
arteriovenous fistula, atherosclerosis, hypertension, vasculitis, Raynaud's disease, aneurysms, arterial
dissections, varicose veins, thrombophlebitis and phlebothrombosis, vascular tumors, and
complications of thrombolysis, balloon angioplasty, vascular replacement, and coronary artery bypass
graft surgery, congestive heart failure, ischemic heart disease, angina pectoris, myocardial infarction,
hypertensive heart disease, degenerative valvular heart disease, calcific aortic valve stenosis,
congenitally bicuspid aortic valve, mitral annular calcification, mitral valve prolapse, rheumatic fever
and rheumatic heart disease, infective endocarditis, nonbacterial thrombotic endocarditis, endocarditis
of systemic lupus erythematosus, carcinoid heart disease, cardiomyopathy, myocarditis, pericarditis,
neoplastic heart disease, congenital heart disease, and complications of cardiac transplantation,
congenital lung anomalies, atelectasis, pulmonary congestion and edema, pulmonary embolism,
pulmonary hemorrhage, pulmonary infarction, pulmonary hypertension, vascular sclerosis,
obstructive pulmonary disease, restrictive pulmonary disease, chronic obstructive pulmonary disease,
emphysema, chronic bronchitis, bronchial asthma, bronchiectasis, bacterial pneumonia, viral and
mycoplasmal pneumonia, lung abscess, pulmonary tuberculosis, diffuse interstitial diseases,
pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis,
hypersensitivity pneumonitis, pulmonary eosinophilia, bronchiolitis obliterans-organizing pneumonia,
diffuse pulmonary hemorrhage syndromes, Goodpasture's syndromes, idiopathic pulmonary
hemosiderosis, pulmonary involvement in collagen-vascular disorders, pulmonary alveolar
proteinosis, lung tumors, inflammatory and noninflammatory pleural effusions, pneumothorax,
pleural tumors, drug-induced lung disease, radiation-induced lung disease, and complications of lung
transplantation; an immune system disorder such as acquired immunodeficiency syndrome (AIDS),
Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis,
anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact
dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema,
episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic
gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's
thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis,
myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis,
psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic
anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative
colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal
circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a
neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms,
Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other
extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive
neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other
demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural
abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous
system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and
Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the
nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis,
encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central
nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic
nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other
neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis,
inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental
disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD),
akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses,
postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration,
and familial frontotemporal dementia; a disorder affecting growth and development such as actinic
keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue
disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis,
primary thrombocythemia, renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic
dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR
syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation),
Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary
keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis,
hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy,
spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural
hearing loss; a lipid disorder such as fatty liver, cholestasis, primary biliary cirrhosis, carnitine
deficiency, carnitine palmitoyltransferase deficiency, myoadenylate deaminase deficiency,
hypertriglyceridemia, lipid storage disorders such as Fabry's disease, Gaucher's disease,
Niemann-Pick's disease, metachromatic leukodystrophy, adrenoleukodystrophy, GM2 gangliosidosis, and
ceroid lipofuscinosis, abetalipoproteinemia, Tangier disease, hyperlipoproteinemia, diabetes mellitus,
lipodystrophy, lipomatoses, acute panniculitis, disseminated fat necrosis, adiposis dolorosa, lipoid
adrenal hyperplasia, minimal change disease, lipomas, atherosclerosis, hypercholesterolemia,
hypercholesterolemia with hypertriglyceridemia, primary hypoalphalipoproteinemia, hypothyroidism,
renal disease, liver disease, lecithin:cholesterol acyltransferase deficiency, cerebrotendinous
xanthomatosis, sitosterolemia, hypocholesterolemia, Tay-Sachs disease, Sandhoff's disease,
hyperlipidemia, hyperlipemia, lipid myopathies, and obesity; and a cell proliferative disorder such as
actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective
tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera,
psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma,
melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland,
bladder, bone, bone marrow, brain, breast, cervix, colon, gall bladder, ganglia, gastrointestinal tract,
heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin,
spleen, testis, thymus, thyroid, uterus, leukemias such as multiple myeloma, and lymphomas such as
Hodgkin's disease.

In another embodiment, a vector capable of expressing KPP or a fragment or derivative
thereof may be administered to a subject to treat or prevent a disorder associated with decreased
expression or activity of KPP including, but not limited to, those described above.

In a further embodiment, a composition comprising a substantially purified KPP in
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent
a disorder associated with decreased expression or activity of KPP including, but not limited to, those
provided above.

In still another embodiment, an agonist which modulates the activity of KPP may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of KPP including, but not limited to, those listed above.

In a further embodiment, an antagonist of KPP may be administered to a subject to treat or
prevent a disorder associated with increased expression or activity of KPP. Examples of such
disorders include, but are not limited to, those cardiovascular diseases, immune system disorders,
neurological disorders, disorders affecting growth and development, lipid disorders, cell proliferative
disorders, and cancers described above. In one aspect, an antibody which specifically binds KPP may
be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a
pharmaceutical agent to cells or tissues which express KPP.

In an additional embodiment, a vector expressing the complement of the polynucleotide
encoding KPP may be administered to a subject to treat or prevent a disorder associated with
increased expression or activity of KPP including, but not limited to, those described above.

In other embodiments, any protein, agonist, antagonist, antibody, complementary sequence,
or vector embodiments may be administered in combination with other appropriate therapeutic
agents. Selection of the appropriate agents for use in combination therapy may be made by one of
ordinary skill in the art, according to conventional pharmaceutical principles. The combination of
therapeutic agents may act synergistically to effect the treatment or prevention of the various
disorders described above. Using this approach, one may be able to achieve therapeutic efficacy with
lower dosages of each agent, thus reducing the potential for adverse side effects.

An antagonist of KPP may be produced using methods which are generally known in the art.
In particular, purified KPP may be used to produce antibodies or to screen libraries of pharmaceutical
agents to identify those which specifically bind KPP. Antibodies to KPP may also be generated using
methods that are well known in the art. Such antibodies may include, but are not limited to,
polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments
produced by a Fab expression library. In an embodiment, neutralizing antibodies (i.e., those which
inhibit dimer formation) can be used therapeutically. Single chain antibodies (e.g., from camels or
llamas) may be potent enzyme inhibitors and may have application in the design of peptide mimetics,
and in the development of immuno-adsorbents and biosensors (Muyldermans, S. (2001) J. Biotechnol.
74:277-302).

For the production of antibodies, various hosts including goats, rabbits, rats, mice, camels,
dromedaries, llamas, humans, and others may be immunized by injection with KPP or with any
fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species,
various adjuvants may be used to increase immunological response. Such adjuvants include, but are
not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances such
as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol.
Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are
especially preferable.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to KPP
have an amino acid sequence consisting of at least about 5 amino acids, and generally will consist of
at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or fragments are
substantially identical to a portion of the amino acid sequence of the natural protein. Short stretches
of KPP amino acids may be fused with those of another protein, such as KLH, and antibodies to the
chimeric molecule may be produced.

Monoclonal antibodies to KPP may be prepared using any technique which provides for the
production of antibody molecules by continuous cell lines in culture. These include, but are not
limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma
technique (Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. Immunol.
Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; Cole, S.P. et al.
(1984) Mol. Cell Biol. 62:109-120).

In addition, techniques developed for the production of "chimeric antibodies," such as the
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used (Morrison, S.L. et al. (1984) Proc. Natl. Acad.
Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; Takeda, S. et al. (1985)
Nature 314:452-454). Alternatively, techniques described for the production of single chain
antibodies may be adapted, using methods known in the art, to produce KPP-specific single chain
antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be
generated by chain shuffling from random combinatorial immunoglobulin libraries (Burton, D.R.
(1991) Proc. Natl. Acad. Sci. USA 88:10134-10137).

Antibodies may also be produced by inducing in vivo production in the lymphocyte
population or by screening immunoglobulin libraries or panels of highly specific binding reagents as
disclosed in the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter,
G. et al. (1991) Nature 349:293-299).

Antibody fragments which contain specific binding sites for KPP may also be generated. For
example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity (Huse, W.D. et al. (1989)
Science 246:1275-1281).

Various immunoassays may be used for screening to identify antibodies having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between KPP and its specific
antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to
two non-interfering KPP epitopes is generally used, but a competitive binding assay may also be
employed (Pound, supra).

Various methods such as Scatchard analysis in conjunction with radioimmunoassay
techniques may be used to assess the affinity of antibodies for KPP. Affinity is expressed as an
association constant, Ka, which is defined as the molar concentration of KPP-antibody complex
divided by the molar concentrations of free antigen and free antibody under equilibrium conditions.
The Ka determined for a preparation of polyclonal antibodies, which are heterogeneous in their
affinities for multiple KPP epitopes, represents the average affinity, or avidity, of the antibodies for
KPP. The Ka determined for a preparation of monoclonal antibodies, which are monospecific for a
particular KPP epitope, represents a true measure of affinity. High-affinity antibody preparations
with Ka ranging from about 10^9 to 10^12 L/mole are preferred for use in immunoassays in which the
KPP-antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations
with Ka ranging from about 10^6 to 10^7 L/mole are preferred for use in immunopurification and similar
procedures which ultimately require dissociation of KPP, preferably in active form, from the antibody
(Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC; Liddell,
J.E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New
York NY).
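The definition of the association constant given above can be connected to Scatchard analysis with a short sketch. For a single class of binding sites, the definition Ka = [complex] / ([free antigen][free antibody sites]) rearranges to B/F = Ka(Bmax - B), where B is the bound antigen concentration, F is the free antigen concentration, and Bmax is the total concentration of binding sites; a plot of B/F against B is then a straight line with slope -Ka. The code below fits that line by ordinary least squares. The function name, the variable names, and the assumption of one homogeneous site class are illustrative choices for this example; polyclonal preparations, which are heterogeneous, generally give curved Scatchard plots to which this simple fit does not apply.

```python
def scatchard_fit(bound, free):
    """Estimate (Ka, Bmax) from paired bound/free antigen concentrations.

    Assumes a single class of binding sites, so that
    bound/free = Ka * (Bmax - bound). Concentrations are molar;
    Ka is returned in L/mole and Bmax in moles/L.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]  # Scatchard ordinate B/F
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("bound concentrations must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope  # slope of the Scatchard line is -Ka
    if ka == 0:
        raise ValueError("zero slope: Ka cannot be estimated")
    bmax = intercept / ka  # x-intercept of the line
    return ka, bmax
```

In practice the bound and free concentrations would come from radioimmunoassay measurements at several antigen concentrations; the fitted Ka can then be compared against the affinity ranges discussed above.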

The titer and avidity of polyclonal antibody preparations may be further evaluated to
determine the quality and suitability of such preparations for certain downstream applications. For
example, a polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml,
preferably 5-10 mg specific antibody/ml, is generally employed in procedures requiring precipitation
of KPP-antibody complexes. Procedures for evaluating antibody specificity, titer, and avidity, and
guidelines for antibody quality and usage in various applications, are generally available (Catty,
supra; Coligan et al., supra).

In another embodiment of the invention, polynucleotides encoding KPP, or any fragment or
complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA,
RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding
KPP. Such technology is well known in the art, and antisense oligonucleotides or larger fragments
can be designed from various locations along the coding or control regions of sequences encoding
KPP (Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press, Totowa NJ).
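To make the notion of a complementary (antisense) sequence concrete, the following minimal sketch derives the reverse complement of a DNA sense-strand segment, written 5' to 3'. The function name and the restriction to the unambiguous bases A, C, G, and T are assumptions for this illustration; designing a usable antisense oligonucleotide involves further considerations (target accessibility, length, chemical modification) that are outside this sketch.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(sense: str) -> str:
    """Return the reverse complement of a 5'->3' DNA sense sequence, also 5'->3'."""
    invalid = set(sense) - set("ACGTacgt")
    if invalid:
        raise ValueError(f"unsupported characters: {sorted(invalid)}")
    # Complement each base, then reverse so the result reads 5'->3'.
    return sense.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original sequence, which is a convenient sanity check when preparing candidate sequences from different locations along a coding or control region.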

In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein (Slater, J.E. et
al. (1998) J. Allergy Clin. Immunol. 102:469-475; Scanlon, K.J. et al. (1995) FASEB J. 9:1288-1296).
Antisense sequences can also be introduced intracellularly through the use of viral vectors,
such as retrovirus and adeno-associated virus vectors (Miller, A.D. (1990) Blood 76:271-278;
Ausubel et al., supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63:323-347). Other gene
delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems
known in the art (Rossi, J.J. (1995) Br. Med. Bull. 51:217-225; Boado, R.J. et al. (1998) J. Pharm.
Sci. 87:1308-1315; Morris, M.C. et al. (1997) Nucleic Acids Res. 25:2730-2736).

In another embodiment of the invention, polynucleotides encoding KPP may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked
inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal,
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii)
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites
(e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D.
(1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399),
hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the
case where a genetic deficiency in KPP expression or regulation causes disease, the expression of
KPP from an appropriate population of transduced cells may alleviate the clinical manifestations
caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in KPP
are treated by constructing mammalian expression vectors encoding KPP and introducing these
vectors by mechanical means into KPP-deficient cells. Mechanical transfer technologies for use with
cells in vivo or ex vivo include (i) direct DNA microinjection into individual cells, (ii) ballistic gold
particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v)
the use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev. Biochem. 62:191-217;
Ivics, Z. (1997) Cell 91:501-510; Boulay, J.-L. and H. Récipon (1998) Curr. Opin. Biotechnol.
9:445-450).

Expression vectors that may be effective for the expression of KPP include, but are not
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors
(Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA),
and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (BD Clontech, Palo Alto CA). KPP
may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV),
Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible
promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl.
Acad. Sci. USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and
H.M. Blau (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid
(Invitrogen)); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND;
Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter
(Rossi, F.M.V. and H.M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of
the endogenous gene encoding KPP from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of
these standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to KPP expression are treated by constructing a retrovirus vector consisting of (i) the
polynucleotide encoding KPP under the control of an independent promoter or the retrovirus long
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc.
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R.
et al. (1998) J. Virol. 72:9873-9880). U.S. Patent No. 5,910,434 to Rigg ("Method for obtaining
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant")
discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by
reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells),
and the return of transduced cells to a patient are procedures well known to persons skilled in
the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol.
71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716;
Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood
89:2283-2290).

In an embodiment, an adenovirus-based gene therapy delivery system is used to deliver
polynucleotides encoding KPP to cells which have one or more genetic abnormalities with respect to
the expression of KPP. The construction and packaging of adenovirus-based vectors are well known
to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be
versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent No. 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999; Annu.
Rev. Nutr. 19:511-544) and Verma, I.M. and N. Somia (1997; Nature 18:389:239-242).

In another embodiment, a herpes-based gene therapy delivery system is used to deliver
polynucleotides encoding KPP to target cells which have one or more genetic abnormalities with
respect to the expression of KPP. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing KPP to cells of the central nervous system, for which HSV has a
tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent No. 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby
incorporated by reference. U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92 which
consists of a genome containing at least one exogenous gene to be transferred to a cell under the
control of the appropriate promoter for purposes including human gene therapy. Also taught by this
patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22.
For HSV vectors, see also Goins, W.F. et al. (1999; J. Virol. 73:519-532) and Xu, H. et al. (1994;
Dev. Biol. 163:152-161). The manipulation of cloned herpesvirus sequences, the generation of
recombinant virus following the transfection of multiple plasmids containing different segments of
the large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of cells
with herpesvirus are techniques well known to those of ordinary skill in the art.

In another embodiment, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver polynucleotides encoding KPP to target cells. The biology of the prototypic alphavirus,
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based
on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid
proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA,
resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity
(e.g., protease and polymerase). Similarly, inserting the coding sequence for KPP into the alphavirus
genome in place of the capsid-coding region results in the production of a large number of
KPP-coding RNAs and the synthesis of high levels of KPP in vector transduced cells. While alphavirus
infection is typically associated with cell lysis within a few days, the ability to establish a persistent
infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that
the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application
(Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the
introduction of KPP into a variety of cell types. The specific transduction of a subset of cells in a
population may require the sorting of cells prior to transduction. The methods of manipulating
infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and
performing alphavirus infections, are well known to those with ordinary skill in the art.

Oligonucleotides derived from the transcription initiation site, e.g., between about positions
-10 and +10 from the start site, may also be employed to inhibit gene expression. Similarly,
inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful
because it causes inhibition of the ability of the double helix to open sufficiently for the binding of
polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using
triplex DNA have been described in the literature (Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr,
Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177). A
complementary sequence or antisense molecule may also be designed to block translation of mRNA
by preventing the transcript from binding to ribosomes.

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze
endonucleolytic cleavage of RNA molecules encoding KPP.

Specific ribozyme cleavage sites within any potential RNA target are initially identified by
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA,
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides,
corresponding to the region of the target gene containing the cleavage site, may be evaluated for
secondary structural features which may render the oligonucleotide inoperable. The suitability of
candidate targets may also be evaluated by testing accessibility to hybridization with complementary
oligonucleotides using ribonuclease protection assays.
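
By way of illustration only, the scanning step described above can be sketched in Python. The function names, the default window length of 17 ribonucleotides, and the choice to center the window on the cleavage triplet are assumptions made for this sketch and do not appear in the specification. Evaluation of secondary structure and ribonuclease protection testing are not modeled.

```python
import re

def find_cleavage_sites(rna):
    """Return (position, triplet) for every GUA, GUU, or GUC in the target RNA.

    Positions are 0-based. DNA input is accepted and read as RNA (T -> U).
    """
    rna = rna.upper().replace("T", "U")
    # A lookahead is used so that every occurrence is reported, even if adjacent.
    return [(m.start(), m.group(1)) for m in re.finditer(r"(?=(GU[AUC]))", rna)]

def window_around(rna, pos, length=17):
    """Return a window of `length` nt (15 to 20) containing the triplet at `pos`.

    The window is centered on the triplet where possible and shifted inward
    when the triplet lies near either end of the sequence (hypothetical choice).
    """
    if not 15 <= length <= 20:
        raise ValueError("window length must be between 15 and 20")
    flank = (length - 3) // 2
    start = max(0, pos - flank)
    end = start + length
    if end > len(rna):
        end = len(rna)
        start = max(0, end - length)
    return rna[start:end]

# Hypothetical target sequence used only to exercise the functions.
target = "A" * 10 + "GUC" + "A" * 10
for pos, triplet in find_cleavage_sites(target):
    print(pos, triplet, window_around(target, pos))
```

Each window returned in this way is only a candidate; whether it is operable would still be determined by the structural and accessibility evaluations described above.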

Complementary ribonucleic acid molecules and ribozymes may be prepared by any method
known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically
synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively,
RNA molecules may be generated by in vitro and in vivo transcription of DNA molecules encoding
KPP. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA
polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize
complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3'
ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester
linkages within the backbone of the molecule. This concept is inherent in the production of PNAs
and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine,
queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine,
cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous
endonucleases.

In other embodiments of the invention, the expression of one or more selected
polynucleotides of the present invention can be altered, inhibited, decreased, or silenced using RNA
interference (RNAi) or post-transcriptional gene silencing (PTGS) methods known in the art. RNAi
is a post-transcriptional mode of gene silencing in which double-stranded RNA (dsRNA) introduced
into a targeted cell specifically suppresses the expression of the homologous gene (i.e., the gene
bearing the sequence complementary to the dsRNA). This effectively knocks out or substantially
reduces the expression of the targeted gene. PTGS can also be accomplished by use of DNA or DNA
fragments. RNAi methods are described by Fire, A. et al. (1998; Nature 391:806-811) and
Gura, T. (2000; Nature 404:804-808). PTGS can also be initiated by introduction of a
complementary segment of DNA into the selected tissue using gene delivery and/or viral vector
delivery methods described herein or known in the art.

RNAi can be induced in mammalian cells by the use of small interfering RNA, also known as
siRNA. siRNA are shorter segments of dsRNA (typically about 21 to 23 nucleotides in length) that
result in vivo from cleavage of introduced dsRNA by the action of an endogenous ribonuclease.
siRNA appear to be the mediators of the RNAi effect in mammals. The most effective siRNAs
appear to be 21 nucleotide dsRNAs with 2 nucleotide 3' overhangs. The use of siRNA for inducing
RNAi in mammalian cells is described by Elbashir, S.M. et al. (2001; Nature 411:494-498).
siRNA can be generated indirectly by introduction of dsRNA into the targeted cell.
Alternatively, siRNA can be synthesized directly and introduced into a cell by transfection methods
and agents described herein or known in the art (such as liposome-mediated transfection, viral vector
methods, or other polynucleotide delivery/introductory methods). Suitable siRNAs can be selected
by examining a transcript of the target polynucleotide (e.g., mRNA) for nucleotide sequences
downstream from the AUG start codon and recording the occurrence of each nucleotide and the 3'
adjacent 19 to 23 nucleotides as potential siRNA target sites, with sequences having a 21 nucleotide
length being preferred. Regions to be avoided for target siRNA sites include the 5' and 3' untranslated
regions (UTRs) and regions near the start codon (within 75 bases), as these may be richer in
regulatory protein binding sites. UTR-binding proteins and/or translation initiation complexes may
interfere with binding of the siRNP endonuclease complex. The selected target sites for siRNA can
then be compared to the appropriate genome database (e.g., human, etc.) using BLAST or other
sequence comparison algorithms known in the art. Target sequences with significant homology to
other coding sequences can be eliminated from consideration. The selected siRNAs can be produced
by chemical synthesis methods known in the art or by in vitro transcription using commercially
available methods and kits such as the SILENCER siRNA construction kit (Ambion, Austin TX).
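
As a non-limiting sketch, the positional rules for choosing candidate siRNA target sites can be expressed in Python as follows. The function name and its parameters are hypothetical. The sketch assumes that the coding region boundaries of the transcript are already known, that the 75-base exclusion is measured from the first base of the AUG start codon, and that a site is described by its total length (21 nucleotides by default). The subsequent BLAST comparison against a genome database is not performed here.

```python
def sirna_candidate_sites(mrna, cds_start, cds_end, site_len=21, start_margin=75):
    """List (position, sequence) pairs for candidate siRNA target sites.

    mrna         -- transcript sequence, 5' to 3'
    cds_start    -- 0-based index of the A of the AUG start codon (assumed known)
    cds_end      -- 0-based index one past the last base of the coding region
    site_len     -- length of each candidate site; 21 is the preferred length
    start_margin -- bases to skip, counted from the first base of the AUG
    """
    if mrna[cds_start:cds_start + 3].upper().replace("T", "U") != "AUG":
        raise ValueError("cds_start does not point at an AUG start codon")
    first = cds_start + start_margin   # skip the region near the start codon
    last = cds_end - site_len          # keep the whole site out of the 3' UTR
    return [(i, mrna[i:i + site_len]) for i in range(first, last + 1)]

# Hypothetical transcript: 10 nt 5' UTR, AUG, 120 nt body, stop codon, 10 nt 3' UTR.
transcript = "G" * 10 + "AUG" + "ACGU" * 30 + "UAA" + "C" * 10
sites = sirna_candidate_sites(transcript, cds_start=10, cds_end=136)
print(len(sites), sites[0])
```

Because the 5' UTR lies upstream of the start codon, excluding the first 75 bases after the AUG also excludes the 5' UTR; the upper bound on the start position excludes the 3' UTR.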

In alternative embodiments, long-term gene silencing and/or RNAi effects can be induced in
selected tissue using expression vectors that continuously express siRNA. This can be accomplished
using expression vectors that are engineered to express hairpin RNAs (shRNAs) using methods
known in the art (see, e.g., Brummelkamp, T.R. et al. (2002) Science 296:550-553; and Paddison, P.J.
et al. (2002) Genes Dev. 16:948-958). In these and related embodiments, shRNAs can be delivered to
target cells using expression vectors known in the art. An example of a suitable expression vector for
delivery of siRNA is the PSILENCER1.0-U6 (circular) plasmid (Ambion). Once delivered to the
target tissue, shRNAs are processed in vivo into siRNA-like molecules capable of carrying out
gene-specific silencing.

In various embodiments, the expression levels of genes targeted by RNAi or PTGS methods
can be determined by assays for mRNA and/or protein analysis. Expression levels of the mRNA of a
targeted gene can be determined, for example, by northern analysis methods using the
NORTHERNMAX-GLY kit (Ambion); by microarray methods; by PCR methods; by real time PCR
methods; and by other RNA/polynucleotide assays known in the art or described herein. Expression
levels of the protein encoded by the targeted gene can be determined, for example, by microarray
methods; by polyacrylamide gel electrophoresis; and by Western analysis using standard techniques
known in the art.

An additional embodiment of the invention encompasses a method for screening for a
compound which is effective in altering expression of a polynucleotide encoding KPP. Compounds
which may be effective in altering expression of a specific polynucleotide may include, but are not
limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides,
transcription factors and other polypeptide transcriptional regulators, and non-macromolecular
chemical entities which are capable of interacting with specific polynucleotide sequences. Effective
compounds may alter polynucleotide expression by acting as either inhibitors or promoters of
polynucleotide expression. Thus, in the treatment of disorders associated with increased KPP
expression or activity, a compound which specifically inhibits expression of the polynucleotide
encoding KPP may be therapeutically useful, and in the treatment of disorders associated with
decreased KPP expression or activity, a compound which specifically promotes expression of the
polynucleotide encoding KPP may be therapeutically useful.

In various embodiments, one or more test compounds may be screened for effectiveness in
altering expression of a specific polynucleotide. A test compound may be obtained by any method
commonly known in the art, including chemical modification of a compound known to be effective in
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound
based on chemical and/or structural properties of the target polynucleotide; and selection from a
library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding KPP is exposed to at least one test compound thus obtained. The sample
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding KPP are assayed by
any method commonly known in the art. Typically, the expression of a specific nucleotide is
detected by hybridization with a probe having a nucleotide sequence complementary to the sequence
of the polynucleotide encoding KPP. The amount of hybridization may be quantified, thus forming
the basis for a comparison of the expression of the polynucleotide both with and without exposure to
one or more test compounds. Detection of a change in the expression of a polynucleotide exposed to
a test compound indicates that the test compound is effective in altering the expression of the
polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide
can be carried out, for example, using a Schizosaccharomyces pombe gene expression system
(Atkins, D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids Res.
28:E15) or a human cell line such as HeLa cell (Clarke, M.L. et al. (2000) Biochem. Biophys. Res.
Commun. 268:8-13). A particular embodiment of the present invention involves screening a
combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide
nucleic acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide
sequence (Bruice, T.W. et al. (1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S.
Patent No. 6,022,691).
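
Solely as an illustrative sketch, the comparison of quantified hybridization with and without exposure to a test compound might be carried out as shown below in Python. The function name and the two-fold cutoff are hypothetical; the specification states only that a detected change in expression indicates effectiveness and does not specify a threshold.

```python
def classify_compound(signal_with, signal_without, min_fold=2.0):
    """Compare hybridization signal with and without a test compound.

    Returns (ratio, label). `min_fold` is a hypothetical cutoff chosen for
    illustration; any real screen would set its own criterion for a change.
    """
    if signal_without <= 0 or signal_with < 0:
        raise ValueError("signals must be non-negative and the control non-zero")
    ratio = signal_with / signal_without
    if ratio >= min_fold:
        return ratio, "promoter"        # expression increased
    if ratio <= 1.0 / min_fold:
        return ratio, "inhibitor"       # expression decreased
    return ratio, "no clear effect"

# Hypothetical signal values in arbitrary units.
print(classify_compound(50.0, 200.0))
print(classify_compound(300.0, 100.0))
```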

Many methods for introducing vectors into cells or tissues are available and equally suitable
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art (Goldman, C.K. et al. (1997) Nat. Biotechnol.
15:462-466).

Any of the therapeutic methods described above may be applied to any subject in need of
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and
monkeys.

An additional embodiment of the invention relates to the administration of a composition
which generally comprises an active ingredient formulated with a pharmaceutically acceptable
excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins.
Various formulations are commonly known and are thoroughly discussed in the latest edition of
Remington's Pharmaceutical Sciences (Maack Publishing, Easton PA). Such compositions may
consist of KPP, antibodies to KPP, and mimetics, agonists, antagonists, or inhibitors of KPP.

In various embodiments, the compositions described herein, such as pharmaceutical
compositions, may be administered by any number of routes including, but not limited to, oral,
intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, pulmonary,
transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.

Compositions for pulmonary administration may be prepared in liquid or dry powder form.
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the
case of small molecules (e.g. traditional low molecular weight organic drugs), aerosol delivery of
fast-acting formulations is well-known in the art. In the case of macromolecules (e.g. larger peptides
and proteins), recent developments in the field of pulmonary delivery via the alveolar region of the
lung have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton,
J.S. et al., U.S. Patent No. 5,997,848). Pulmonary delivery allows administration without needle
injection, and obviates the need for potentially toxic penetration enhancers.

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination
of an effective dose is well within the capability of those skilled in the art.

Specialized forms of compositions may be prepared for direct intracellular delivery of
macromolecules comprising KPP or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of
the macromolecule. Alternatively, KPP or a fragment thereof may be joined to a short cationic
N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et
al. (1999) Science 285:1569-1572).

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs,
monkeys, or pigs. An animal model may also be used to determine the appropriate concentration
range and route of administration. Such information can then be used to determine useful doses and
routes for administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example KPP
or fragments thereof, antibodies of KPP, and agonists, antagonists or inhibitors of KPP, which
ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by
standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity.
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the
patient, and the route of administration.
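
For purposes of illustration, the LD50/ED50 ratio can be computed as sketched below in Python. The log-linear interpolation used to estimate the dose producing a 50% response is an assumption of this sketch and is one of several standard approaches; the specification does not prescribe a particular estimation procedure. The function names and the dose-response values are hypothetical.

```python
import math

def dose_at_half_response(doses, fractions):
    """Estimate the dose at which 50% of the population responds.

    doses     -- ascending positive doses
    fractions -- fraction of the population responding at each dose (0 to 1)
    Interpolates linearly on log10(dose) between the two bracketing points.
    """
    points = list(zip(doses, fractions))
    for (d0, f0), (d1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            t = (0.5 - f0) / (f1 - f0)
            log_dose = math.log10(d0) + t * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    raise ValueError("response data do not bracket 50%")

def therapeutic_index(ld50, ed50):
    """Return the dose ratio of toxic to therapeutic effects, LD50/ED50."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50

# Hypothetical data: fraction showing a therapeutic effect, and fraction dying.
ed50 = dose_at_half_response([1, 10, 100], [0.0, 0.5, 1.0])
ld50 = dose_at_half_response([100, 1000, 10000], [0.0, 0.5, 1.0])
print(therapeutic_index(ld50, ed50))
```

A larger value indicates a wider separation between the therapeutic and lethal doses, which is why compositions with large therapeutic indices are preferred.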

The exact dosage will be determined by the practitioner, in light of factors related to the
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the
active moiety or to maintain the desired effect. Factors which may be taken into account include the
severity of the disease state, the general health of the subject, the age, weight, and gender of the
subject, time and frequency of administration, drug combination(s), reaction sensitivities, and
response to therapy. Long-acting compositions may be administered every 3 to 4 days, every week,
or biweekly depending on the half-life and clearance rate of the particular formulation.

Normal dosage amounts may vary from about 0.1 µg to 100,000 µg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells,
conditions, locations, etc.
DIAGNOSTICS 

In another embodiment, antibodies which specifically bind KPP may be used for the
diagnosis of disorders characterized by expression of KPP, or in assays to monitor patients being
treated with KPP or agonists, antagonists, or inhibitors of KPP. Antibodies useful for diagnostic
purposes may be prepared in the same manner as described above for therapeutics. Diagnostic assays
for KPP include methods which utilize the antibody and a label to detect KPP in human body fluids
or in extracts of cells or tissues. The antibodies may be used with or without modification, and may
be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of reporter
molecules, several of which are described above, are known in the art and may be used.

A variety of protocols for measuring KPP, including ELISAs, RIAs, and FACS, are known in
the art and provide a basis for diagnosing altered or abnormal levels of KPP expression. Normal or
standard values for KPP expression are established by combining body fluids or cell extracts taken
from normal mammalian subjects, for example, human subjects, with antibodies to KPP under
conditions suitable for complex formation. The amount of standard complex formation may be
quantitated by various methods, such as photometric means. Quantities of KPP expressed in subject,
control, and disease samples from biopsied tissues are compared with the standard values. Deviation
between standard and subject values establishes the parameters for diagnosing disease.

In another embodiment of the invention, polynucleotides encoding KPP may be used for
diagnostic purposes. The polynucleotides which may be used include oligonucleotides,
complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect
and quantify gene expression in biopsied tissues in which expression of KPP may be correlated with
disease. The diagnostic assay may be used to determine absence, presence, and excess expression of
KPP, and to monitor regulation of KPP levels during therapeutic intervention.

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotides,
including genomic sequences, encoding KPP or closely related molecules may be used to identify
nucleic acid sequences which encode KPP. The specificity of the probe, whether it is made from a
highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a conserved
motif, and the stringency of the hybridization or amplification will determine whether the probe
identifies only naturally occurring sequences encoding KPP, allelic variants, or related sequences.

Probes may also be used for the detection of related sequences, and may have at least 50%
sequence identity to any of the KPP encoding sequences. The hybridization probes of the subject
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:6-10 or from
genomic sequences including promoters, enhancers, and introns of the KPP gene.
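
As a simplified sketch, the 50% sequence identity criterion can be checked in Python for two sequences that have already been aligned to equal length. The function name is hypothetical, and the ungapped, position-by-position comparison is an assumption of this sketch; in practice identity is determined with sequence alignment algorithms known in the art.

```python
def percent_identity(probe, target):
    """Percent of positions at which two equal-length sequences are identical.

    Assumes the sequences are already aligned without gaps (illustrative only).
    """
    if len(probe) != len(target) or not probe:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(probe.upper(), target.upper()) if a == b)
    return 100.0 * matches / len(probe)

# Hypothetical sequences, not taken from the sequence listing.
identity = percent_identity("ACGTACGT", "ACGTTTTT")
print(identity, identity >= 50.0)
```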

Means for producing specific hybridization probes for polynucleotides encoding KPP include
the cloning of polynucleotides encoding KPP or KPP derivatives into vectors for the production of
mRNA probes. Such vectors are known in the art, are commercially available, and may be used to
synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerases and the
appropriate labeled nucleotides. Hybridization probes may be labeled by a variety of reporter groups,
for example, by radionuclides such as 32P or 35S, or by enzymatic labels, such as alkaline phosphatase
coupled to the probe via avidin/biotin coupling systems, and the like.

Polynucleotides encoding KPP may be used for the diagnosis of disorders associated with
expression of KPP. Examples of such disorders include, but are not limited to, a cardiovascular
disease such as arteriovenous fistula, atherosclerosis, hypertension, vasculitis, Raynaud's disease,
aneurysms, arterial dissections, varicose veins, thrombophlebitis and phlebothrombosis, vascular
tumors, and complications of thrombolysis, balloon angioplasty, vascular replacement, and coronary
artery bypass graft surgery, congestive heart failure, ischemic heart disease, angina pectoris,
myocardial infarction, hypertensive heart disease, degenerative valvular heart disease, calcific aortic
valve stenosis, congenitally bicuspid aortic valve, mitral annular calcification, mitral valve prolapse,
rheumatic fever and rheumatic heart disease, infective endocarditis, nonbacterial thrombotic
endocarditis, endocarditis of systemic lupus erythematosus, carcinoid heart disease, cardiomyopathy,
myocarditis, pericarditis, neoplastic heart disease, congenital heart disease, and complications of
cardiac transplantation, congenital lung anomalies, atelectasis, pulmonary congestion and edema,
pulmonary embolism, pulmonary hemorrhage, pulmonary infarction, pulmonary hypertension,
vascular sclerosis, obstructive pulmonary disease, restrictive pulmonary disease, chronic obstructive
pulmonary disease, emphysema, chronic bronchitis, bronchial asthma, bronchiectasis, bacterial
pneumonia, viral and mycoplasmal pneumonia, lung abscess, pulmonary tuberculosis, diffuse
interstitial diseases, pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative
interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia, bronchiolitis
obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, Goodpasture's
syndromes, idiopathic pulmonary hemosiderosis, pulmonary involvement in collagen-vascular
disorders, pulmonary alveolar proteinosis, lung tumors, inflammatory and noninflammatory pleural
effusions, pneumothorax, pleural tumors, drug-induced lung disease, radiation-induced lung disease,
and complications of lung transplantation; an immune system disorder such as acquired
immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome,
allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune
hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal
dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis,
dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins,
erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's
syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel
syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation,
osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid
arthritis, scleroderma, Sjogren's syndrome, systemic anaphylaxis, systemic lupus erythematosus,
systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome,
complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal,

pamitic, protozoal, and helminthic infections, and trauma; a neurological disorder such as epilepsy, 
ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease,
Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic
lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis
pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and
viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases
including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal
familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis,
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorders of the central nervous system including Down
syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve
disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral
nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and
toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety,
and schizophrenic disorders, seasonal affective disorder (SAD), akathesia, amnesia, catatonia,
diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia,
Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial
frontotemporal dementia; a disorder affecting growth and development such as actinic keratosis,
arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease
(MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary
thrombocythemia, renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism,
Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms'
tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome,
myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary
neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism,
hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida,
anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss; a lipid
disorder such as fatty liver, cholestasis, primary biliary cirrhosis, carnitine deficiency, carnitine
palmitoyltransferase deficiency, myoadenylate deaminase deficiency, hypertriglyceridemia, lipid
storage disorders such as Fabry's disease, Gaucher's disease, Niemann-Pick's disease, metachromatic
leukodystrophy, adrenoleukodystrophy, GM2 gangliosidosis, and ceroid lipofuscinosis,
abetalipoproteinemia, Tangier disease, hyperlipoproteinemia, diabetes mellitus, lipodystrophy,
lipomatoses, acute panniculitis, disseminated fat necrosis, adiposis dolorosa, lipoid adrenal
hyperplasia, minimal change disease, lipomas, atherosclerosis, hypercholesterolemia,
hypercholesterolemia with hypertriglyceridemia, primary hypoalphalipoproteinemia, hypothyroidism,
renal disease, liver disease, lecithin:cholesterol acyltransferase deficiency, cerebrotendinous
xanthomatosis, sitosterolemia, hypocholesterolemia, Tay-Sachs disease, Sandhoff's disease,
hyperlipidemia, hyperlipemia, lipid myopathies, and obesity; and a cell proliferative disorder such as
actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective
tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera,
psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma,
melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland,
bladder, bone, bone marrow, brain, breast, cervix, colon, gall bladder, ganglia, gastrointestinal tract,
heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin,
spleen, testis, thymus, thyroid, uterus, leukemias such as multiple myeloma, and lymphomas such as
Hodgkin's disease. Polynucleotides encoding KPP may be used in Southern or northern analysis, dot
blot, or other membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat
ELISA-like assays; and in microarrays utilizing fluids or tissues from patients to detect altered KPP
expression. Such qualitative or quantitative methods are well known in the art.

In a particular embodiment, polynucleotides encoding KPP may be used in assays that detect
the presence of associated disorders, particularly those mentioned above. Polynucleotides
complementary to sequences encoding KPP may be labeled by standard methods and added to a fluid
or tissue sample from a patient under conditions suitable for the formation of hybridization
complexes. After a suitable incubation period, the sample is washed and the signal is quantified and
compared with a standard value. If the amount of signal in the patient sample is significantly altered
in comparison to a control sample then the presence of altered levels of polynucleotides encoding
KPP in the sample indicates the presence of the associated disorder. Such assays may also be used to
evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials,
or to monitor the treatment of an individual patient.

In order to provide a basis for the diagnosis of a disorder associated with expression of KPP,
a normal or standard profile for expression is established. This may be accomplished by combining
body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a
fragment thereof, encoding KPP, under conditions suitable for hybridization or amplification.
Standard hybridization may be quantified by comparing the values obtained from normal subjects
with values from an experiment in which a known amount of a substantially purified polynucleotide
is used. Standard values obtained in this manner may be compared with values obtained from
samples from patients who are symptomatic for a disorder. Deviation from standard values is used to
establish the presence of a disorder.
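
The comparison of a patient value against standard values can be sketched in a few lines. The
following is an illustration only: the text does not specify a statistical test, so the z-score and the
cutoff of 2.0 standard deviations are assumptions, as are the function names and the example values.

```python
from statistics import mean, stdev


def standard_profile(normal_values):
    """Summarize signals from normal subjects as (mean, sample standard deviation)."""
    return mean(normal_values), stdev(normal_values)


def deviates_from_standard(patient_value, normal_values, z_cutoff=2.0):
    """Return (z_score, flagged) for one patient signal.

    z_cutoff is a hypothetical threshold chosen for illustration;
    it is not taken from the specification.
    """
    mu, sd = standard_profile(normal_values)
    if sd == 0:
        raise ValueError("normal values show no variation; z-score is undefined")
    z = (patient_value - mu) / sd
    return z, abs(z) > z_cutoff


# Made-up hybridization signals (arbitrary units) from normal subjects.
normal = [98.0, 102.0, 100.0, 101.0, 99.0]
z_high, flagged_high = deviates_from_standard(130.0, normal)
z_near, flagged_near = deviates_from_standard(100.5, normal)
```

In practice the standard values would first be calibrated against the known amount of purified
polynucleotide, as described above; that step is omitted here.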

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained from
successive assays may be used to show the efficacy of treatment over a period ranging from several
days to months.

With respect to cancer, the presence of an abnormal amount of transcript (either under- or
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the
development of the disease, or may provide a means for detecting the disease prior to the appearance
of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals
to employ preventative measures or aggressive treatment earlier, thereby preventing the development
or further progression of the cancer.

Additional diagnostic uses for oligonucleotides designed from the sequences encoding KPP
may involve the use of PCR. These oligomers may be chemically synthesized, generated
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide
encoding KPP, or a fragment of a polynucleotide complementary to the polynucleotide encoding
KPP, and will be employed under optimized conditions for identification of a specific gene or
condition. Oligomers may also be employed under less stringent conditions for detection or
quantification of closely related DNA or RNA sequences.

In a particular aspect, oligonucleotide primers derived from polynucleotides encoding KPP
may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions
and deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods
of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP)
and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from
polynucleotides encoding KPP are used to amplify DNA using the polymerase chain reaction (PCR).
The DNA may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids,
and the like. SNPs in the DNA cause differences in the secondary and tertiary structures of PCR
products in single-stranded form, and these differences are detectable using gel electrophoresis in
non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently labeled, which allows
detection of the amplimers in high-throughput equipment such as DNA sequencing machines.
Additionally, sequence database analysis methods, termed in silico SNP (isSNP), are capable of
identifying polymorphisms by comparing the sequence of individual overlapping DNA fragments
which assemble into a common consensus sequence. These computer-based methods filter out
sequence variations due to laboratory preparation of DNA and sequencing errors using statistical
models and automated analyses of DNA sequence chromatograms. In the alternative, SNPs may be
detected and characterized by mass spectrometry using, for example, the high throughput
MASSARRAY system (Sequenom, Inc., San Diego CA).
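
The in silico comparison of overlapping fragments can be illustrated with a minimal sketch. It
assumes the fragments have already been aligned to a common consensus and padded with "-" where
a fragment gives no coverage. A simple recurrence count stands in for the statistical models and
chromatogram analysis mentioned above; the function name, the threshold, and the example reads
are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Find positions where a second base recurs among aligned fragments.

    aligned_reads: equal-length strings aligned to one consensus,
                   with '-' marking positions a fragment does not cover.
    min_minor_count: hypothetical filter; a variant base seen fewer times
                     than this is treated as a likely sequencing error.
    Returns {position: (most_common_base, second_most_common_base)}.
    """
    snps = {}
    for pos in range(len(aligned_reads[0])):
        counts = Counter(r[pos] for r in aligned_reads if r[pos] != "-")
        ranked = counts.most_common(2)
        if len(ranked) == 2 and ranked[1][1] >= min_minor_count:
            snps[pos] = (ranked[0][0], ranked[1][0])
    return snps


# Made-up overlapping fragments: position 2 varies in two reads (candidate SNP),
# position 7 varies in only one read (filtered out as a probable error).
reads = [
    "ACGTAC--",
    "ACGTACGT",
    "ACATACGT",
    "--ATACGT",
    "ACGTACGA",
]
snps = candidate_snps(reads)
```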

SNPs may be used to study the genetic basis of human disease. For example, at least 16
common SNPs have been associated with non-insulin-dependent diabetes mellitus. SNPs are also
useful for examining differences in disease outcomes in monogenic disorders, such as cystic fibrosis,
sickle cell anemia, or chronic granulomatous disease. For example, variants in the mannose-binding
lectin, MBL2, have been shown to be correlated with deleterious pulmonary outcomes in cystic
fibrosis. SNPs also have utility in pharmacogenomics, the identification of genetic variants that
influence a patient's response to a drug, such as life-threatening toxicity. For example, a variation in
N-acetyl transferase is associated with a high incidence of peripheral neuropathy in response to the
anti-tuberculosis drug isoniazid, while a variation in the core promoter of the ALOX5 gene results in
diminished clinical response to treatment with an anti-asthma drug that targets the 5-lipoxygenase
pathway. Analysis of the distribution of SNPs in different populations is useful for investigating
genetic drift, mutation, recombination, and selection, as well as for tracing the origins of populations
and their migrations (Taylor, J.G. et al. (2001) Trends Mol. Med. 7:507-512; Kwok, P.-Y. and Z. Gu
(1999) Mol. Med. Today 5:538-543; Nowotny, P. et al. (2001) Curr. Opin. Neurobiol. 11:637-641).

Methods which may also be used to quantify the expression of KPP include radiolabeling or
biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from
standard curves (Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al. (1993)
Anal. Biochem. 212:229-236). The speed of quantitation of multiple samples may be accelerated by
running the assay in a high-throughput format where the oligomer or polynucleotide of interest is
presented in various dilutions and a spectrophotometric or colorimetric response gives rapid
quantitation.
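
Interpolating a result from a standard curve can be sketched as follows. This is a minimal example
under stated assumptions: piecewise-linear interpolation between bracketing standards is one
common choice, not a method prescribed by the text, and the function name and dilution values are
invented for illustration.

```python
def interpolate_from_standard_curve(signal, standards):
    """Estimate an amount from a measured signal.

    standards: list of (known_amount, measured_signal) pairs from a dilution series.
    Uses linear interpolation between the two standards that bracket the signal.
    """
    pts = sorted(standards, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(pts, pts[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0
            return a0 + (signal - s0) * (a1 - a0) / (s1 - s0)
    raise ValueError("signal lies outside the range of the standard curve")


# Made-up dilution series: (amount, absorbance).
standards = [(0.0, 0.02), (1.0, 0.22), (2.0, 0.42), (4.0, 0.82)]
amount = interpolate_from_standard_curve(0.62, standards)
```

Signals outside the range of the standards raise an error rather than being extrapolated, since
extrapolation beyond the measured dilutions is unreliable.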

In further embodiments, oligonucleotides or longer fragments derived from any of the
polynucleotides described herein may be used as elements on a microarray. The microarray can be
used in transcript imaging techniques which monitor the relative expression levels of large numbers
of genes simultaneously as described below. The microarray may also be used to identify genetic
variants, mutations, and polymorphisms. This information may be used to determine gene function,
to understand the genetic basis of a disorder, to diagnose a disorder, to monitor
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be used
to develop a pharmacogenomic profile of a patient in order to select the most appropriate and
effective treatment regimen for that patient. For example, therapeutic agents which are highly
effective and display the fewest side effects may be selected for a patient based on his/her
pharmacogenomic profile.

In another embodiment, KPP, fragments of KPP, or antibodies specific for KPP may be used
as elements on a microarray. The microarray may be used to monitor or measure protein-protein
interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by
quantifying the number of expressed genes and their relative abundance under given conditions and at
a given time (Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent No. 5,840,484,
hereby expressly incorporated by reference herein). Thus a transcript image may be generated by
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present
invention or their complements comprise a subset of a plurality of elements on a microarray. The
resultant transcript image would provide a profile of gene activity.
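
A transcript image, as defined above, records how many genes are expressed and their relative
abundance. The sketch below shows that bookkeeping for a table of per-gene signal values. The
function name and gene labels are hypothetical, and treating any nonzero signal as "expressed" is a
simplifying assumption.

```python
def transcript_image(signals):
    """Return (number_of_expressed_genes, {gene: relative_abundance}).

    signals: {gene: non-negative signal or count}.
    A gene counts as expressed if its signal is above zero (an assumption).
    """
    total = sum(signals.values())
    if total <= 0:
        raise ValueError("no expression signal in the sample")
    abundance = {g: v / total for g, v in signals.items() if v > 0}
    return len(abundance), abundance


# Made-up signals for four array elements.
n_expressed, image = transcript_image(
    {"geneA": 50, "geneB": 30, "geneC": 0, "geneD": 20}
)
```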

Transcript images may be generated using transcripts isolated from tissues, cell lines,
biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo,
as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

Transcript images which profile the expression of the polynucleotides of the present
invention may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and
toxicity (Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson
(2000) Toxicol. Lett. 112-113:467-471). If a test compound has a signature similar to that of a
compound with known toxicity, it is likely to share those toxic properties. These fingerprints or
signatures are most useful and refined when they contain expression information from a large number
of genes and gene families. Ideally, a genome-wide measurement of expression provides the highest
quality signature. Even genes whose expression is not altered by any tested compounds are important
as well, as the levels of expression of these genes are used to normalize the rest of the expression
data. The normalization procedure is useful for comparison of expression data after treatment with
different compounds. While the assignment of gene function to elements of a toxicant signature aids
in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the
statistical matching of signatures which leads to prediction of toxicity (see, for example, Press
Release 00-02 from the National Institute of Environmental Health Sciences, released February 29,
2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm). Therefore, it is important and
desirable in toxicological screening using toxicant signatures to include all expressed gene sequences.
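
The statistical matching of signatures can be illustrated with a small sketch. Pearson correlation is
used here as one plausible similarity measure; the text does not name a specific statistic, so that
choice is an assumption, as are the function names, compound labels, and expression values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def closest_signature(test_signature, reference_signatures):
    """Return the name of the reference signature most correlated with the test one.

    reference_signatures: {compound_name: [expression change per gene]},
    with genes listed in the same order as in test_signature.
    """
    return max(
        reference_signatures,
        key=lambda name: pearson(test_signature, reference_signatures[name]),
    )


# Made-up normalized expression changes over the same four genes.
references = {
    "known_toxicant_X": [2.0, -1.0, 0.5, 1.5],
    "inert_compound_Y": [0.1, 0.0, -0.1, 0.05],
}
test_compound = [1.8, -0.9, 0.4, 1.6]
best_match = closest_signature(test_compound, references)
```

Note that no knowledge of what each gene does is needed for the match, which is the point made in
the paragraph above.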

In an embodiment, the toxicity of a test compound can be assessed by treating a biological
sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the
treated biological sample are hybridized with one or more probes specific to the polynucleotides of
the present invention, so that transcript levels corresponding to the polynucleotides of the present
invention may be quantified. The transcript levels in the treated biological sample are compared with
levels in an untreated biological sample. Differences in the transcript levels between the two samples
are indicative of a toxic response caused by the test compound in the treated sample.
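
The treated-versus-untreated comparison can be sketched as below. The example normalizes each
sample to genes assumed to be unaffected by treatment and then reports log2 ratios. The two-fold
cutoff, the function names, and the probe labels are assumptions made for illustration only.

```python
from math import log2


def normalize(levels, reference_genes):
    """Scale all transcript levels by the mean level of the reference genes."""
    scale = sum(levels[g] for g in reference_genes) / len(reference_genes)
    return {g: v / scale for g, v in levels.items()}


def altered_transcripts(treated, untreated, reference_genes, min_abs_log2=1.0):
    """Return {gene: log2(treated/untreated)} for genes changed at least two-fold.

    All levels must be positive. min_abs_log2 is a hypothetical cutoff.
    """
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    changes = {}
    for gene in t:
        if gene in reference_genes:
            continue
        lfc = log2(t[gene] / u[gene])
        if abs(lfc) >= min_abs_log2:
            changes[gene] = lfc
    return changes


# Made-up signals; the treated sample was measured at twice the overall intensity.
untreated = {"ref1": 100.0, "ref2": 200.0, "probe_1": 50.0, "probe_2": 80.0}
treated = {"ref1": 200.0, "ref2": 400.0, "probe_1": 400.0, "probe_2": 170.0}
changes = altered_transcripts(treated, untreated, ["ref1", "ref2"])
```

Without the normalization step, probe_2 would appear to have doubled simply because the treated
sample gave a stronger overall signal.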

Another embodiment relates to the use of the polypeptides disclosed herein to analyze the
proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression
in a particular tissue or cell type. Each protein component of a proteome can be subjected
individually to further analysis. Proteome expression patterns, or profiles, are analyzed by
quantifying the number of expressed proteins and their relative abundance under given conditions and
at a given time. A profile of a cell's proteome may thus be generated by separating and analyzing the
polypeptides of a particular tissue or cell type. In one embodiment, the separation is achieved using
two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric
focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate
slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are
visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an
agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot
is generally proportional to the level of the protein in the sample. The optical densities of
equivalently positioned protein spots from different samples, for example, from biological samples
either treated or untreated with a test compound or therapeutic agent, are compared to identify any
changes in protein spot density related to the treatment. The proteins in the spots are partially
sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed
by mass spectrometry. The identity of the protein in a spot may be determined by comparing its
partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences
of interest. In some cases, further sequence data may be obtained for definitive protein identification.
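
Matching a partial sequence from a gel spot against candidate polypeptides can be sketched as a
simple exact-substring search. The minimum of 5 contiguous residues comes from the text above;
the function name, the candidate labels, and the sequences are invented for illustration, and a real
comparison would normally tolerate mismatches.

```python
def identify_protein(partial_seq, candidates, min_len=5):
    """Return names of candidate polypeptides that contain the partial sequence.

    partial_seq: contiguous amino acid residues (one-letter codes) from a spot.
    candidates:  {name: full polypeptide sequence}.
    min_len:     minimum number of contiguous residues required for a comparison.
    """
    if len(partial_seq) < min_len:
        raise ValueError(f"need at least {min_len} contiguous residues")
    return [name for name, seq in candidates.items() if partial_seq in seq]


# Hypothetical sequences, not taken from the sequence listing.
candidates = {
    "candidate_1": "MSTEKLQDGRVAL",
    "candidate_2": "MAPKDGRVSLQE",
}
hits = identify_protein("KLQDG", candidates)
```

When several candidates match, the result is ambiguous, which corresponds to the case where
further sequence data is needed for a definitive identification.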

A proteomic profile may also be generated using antibodies specific for KPP to quantify the
levels of KPP expression. In one embodiment, the antibodies are used as elements on a microarray,
and protein expression levels are quantified by contacting the microarray with the sample and
detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem.
270:103-111; Mendoze, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed
by a variety of methods known in the art, for example, by reacting the proteins in the sample with a
thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at
each array element.

Toxicant signatures at the proteome level are also useful for toxicological screening, and
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor
correlation between transcript and protein abundances for some proteins in some tissues (Anderson,
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to
rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such
cases.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins that are expressed in the treated
biological sample are separated so that the amount of each protein can be quantified. The amount of
each protein is compared to the amount of the corresponding protein in an untreated biological
sample. A difference in the amount of protein between the two samples is indicative of a toxic
response to the test compound in the treated sample. Individual proteins are identified by sequencing
the amino acid residues of the individual proteins and comparing these partial sequences to the
polypeptides of the present invention.

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins from the biological sample are
incubated with antibodies specific to the polypeptides of the present invention. The amount of
protein recognized by the antibodies is quantified. The amount of protein in the treated biological
sample is compared with the amount in an untreated biological sample. A difference in the amount of
protein between the two samples is indicative of a toxic response to the test compound in the treated
sample.

Microarrays may be prepared, used, and analyzed using methods known in the art (Brennan,
T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA
93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/25116; Shalon, D. et al. (1995)
PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155;
Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662). Various types of microarrays are well known
and thoroughly described in Schena, M., ed. (1999; DNA Microarrays: A Practical Approach, Oxford
University Press, London).

In another embodiment of the invention, nucleic acid sequences encoding KPP may be used
to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be
preferable over coding sequences. For example, conservation of a coding sequence among members
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries (Harrington, J.J. et al. (1997) Nat. Genet.
15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; Trask, B.J. (1991) Trends Genet. 7:149-154).
Once mapped, the nucleic acid sequences may be used to develop genetic linkage maps, for example,
which correlate the inheritance of a disease state with the inheritance of a particular chromosome
region or restriction fragment length polymorphism (RFLP) (Lander, E.S. and D. Botstein (1986)
Proc. Natl. Acad. Sci. USA 83:7353-7357).

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic
map data (Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968). Examples of genetic map data
can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM)
World Wide Web site. Correlation between the location of the gene encoding KPP on a physical map
and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA
associated with that disorder and thus may further positional cloning efforts.

In situ hybridization of chromosomal preparations and physical mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending genetic maps.
Often the placement of a gene on the chromosome of another mammalian species, such as mouse,
may reveal associated markers even if the exact chromosomal locus is not known. This information
is valuable to investigators searching for disease genes using positional cloning or other gene
discovery techniques. Once the gene or genes responsible for a disease or syndrome have been
crudely localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to
11q22-23, any sequences mapping to that area may represent associated or regulatory genes for
further investigation (Gatti, R.A. et al. (1988) Nature 336:577-580). The nucleotide sequence of the
instant invention may also be used to detect differences in the chromosomal location due to
translocation, inversion, etc., among normal, carrier, or affected individuals.

In another embodiment of the invention, KPP, its catalytic or immunogenic fragments, or
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes
between KPP and the agent being tested may be measured.

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest (Geysen, et al. (1984) PCT application
WO84/03564). In this method, large numbers of different small test compounds are synthesized on a
solid substrate. The test compounds are reacted with KPP, or fragments thereof, and washed. Bound
KPP is then detected by methods well known in the art. Purified KPP can also be coated directly onto
plates for use in the aforementioned drug screening techniques. Alternatively, non-neutralizing
antibodies can be used to capture the peptide and immobilize it on a solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing 
antibodies capable of binding KPP specifically compete with a test compound for binding KPP. In 
this manner, antibodies can be used to detect the presence of any peptide which shares one or more
antigenic determinants with KPP. 

In additional embodiments, the nucleotide sequences which encode KPP may be used in any
molecular biology techniques that have yet to be developed, provided the new techniques rely on
properties of nucleotide sequences that are currently known, including, but not limited to, such
properties as the triplet genetic code and specific base pair interactions.

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are,
therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure
in any way whatsoever.

The disclosures of all patents, applications and publications, mentioned above and below, are
expressly incorporated by reference herein.

EXAMPLES 

I. Construction of cDNA Libraries 

Incyte cDNAs are derived from cDNA libraries described in the LIFESEQ database (Incyte,
Palo Alto CA) and shown in Table 4, column 3. Some tissues are homogenized and lysed in
guanidinium isothiocyanate, while others are homogenized and lysed in phenol or in a suitable
mixture of denaturants, such as TRIZOL (Invitrogen), a monophasic solution of phenol and guanidine
isothiocyanate. The resulting lysates are centrifuged over CsCl cushions or extracted with
chloroform. RNA is precipitated from the lysates with either isopropanol or sodium acetate and
ethanol, or by other routine methods.

Phenol extraction and precipitation of RNA are repeated as necessary to increase RNA purity.
In some cases, RNA is treated with DNase. For most libraries, poly(A)+ RNA is isolated using oligo
d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN, Chatsworth
CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA is isolated directly
from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit
(Ambion, Austin TX).

In some cases, Stratagene is provided with RNA and constructs the corresponding cDNA
libraries. Otherwise, cDNA is synthesized and cDNA libraries are constructed with the UNIZAP
vector system (Stratagene) or SUPERSCRIPT plasmid system (Invitrogen), using the recommended
procedures or similar methods known in the art (Ausubel et al., supra, ch. 5). Reverse transcription is
initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters are ligated to double
stranded cDNA, and the cDNA is digested with the appropriate restriction enzyme or enzymes. For
most libraries, the cDNA is size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE
CL2B, or SEPHAROSE CL4B column chromatography (Amersham Biosciences) or preparative
agarose gel electrophoresis. cDNAs are ligated into compatible restriction enzyme sites of the
polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid
(Invitrogen, Carlsbad CA), PCDNA2.1 plasmid (Invitrogen), PBK-CMV plasmid (Stratagene), PCR2-
TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte, Palo Alto CA),
pRARE (Incyte), or pINCY (Incyte), or derivatives thereof. Recombinant plasmids are transformed
into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α,
DH10B, or ElectroMAX DH10B from Invitrogen.

II. Isolation of cDNA Clones

Plasmids obtained as described in Example I are recovered from host cells by in vivo excision
using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids are purified using at least
one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an AGTC
Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid,
QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96
plasmid purification kit from QIAGEN. Following precipitation, plasmids are resuspended in 0.1 ml
of distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA is amplified from host cell lysates using direct linkPCR in a 
high-throughput format (Rao. V.B. (1994) Anal Biochenv 216: 1-14). Host cell'lysis and thermal 
cycling steps are carried out in a single reaction mixture. Samples are processed and stored in 384- 
well plates, and the concentration of amplified plasmid DNA is quantified fluororaetrically usmg 
10 PICOGREEN dye (Molecular Probes. Eugene OR) and a FLUOROSKAN H fluorescence scanner 
(Labsystems Oy, Helsinki, Finland), 
III. Sequencing and Analysis

Incyte cDNAs recovered in plasmids as described in Example II are sequenced as follows.
Sequencing reactions are processed using standard methods or high-throughput instrumentation such
as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler (MJ
Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the MICROLAB
2200 (Hamilton) liquid transfer system. cDNA sequencing reactions are prepared using reagents
provided by Amersham Biosciences or supplied in ABI sequencing kits such as the ABI PRISM
BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). Electrophoretic
separation of cDNA sequencing reactions and detection of labeled polynucleotides are carried out
using the MEGABACE 1000 DNA sequencing system (Amersham Biosciences); the ABI PRISM 373
or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI protocols and base
calling software; or other sequence analysis systems known in the art. Reading frames within the
cDNA sequences are identified using standard methods (Ausubel et al., supra, ch. 7). Some of the
cDNA sequences are selected for extension using the techniques disclosed in Example VIII.

Polynucleotide sequences derived from Incyte cDNAs are validated by removing vector,
linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and programs
based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The Incyte
cDNA sequences or translations thereof are then queried against a selection of public databases such
as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and BLOCKS,
PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo sapiens, Rattus
norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces
pombe, and Candida albicans (Incyte, Palo Alto CA); hidden Markov model (HMM)-based protein
family databases such as PFAM, INCY, and TIGRFAM (Haft, D.H. et al. (2001) Nucleic Acids Res.
29:41-43); and HMM-based protein domain databases such as SMART (Schultz, J. et al. (1998) Proc.
Natl. Acad. Sci. USA 95:5857-5864; Letunic, I. et al. (2002) Nucleic Acids Res. 30:242-244).
(HMM is a probabilistic approach which analyzes consensus primary structures of gene families; see,
for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol. 6:361-365.) The queries are performed
using programs based on BLAST, FASTA, BLIMPS, and HMMER. The Incyte cDNA sequences are
assembled to produce full length polynucleotide sequences. Alternatively, GenBank cDNAs,
GenBank ESTs, stitched sequences, stretched sequences, or Genscan-predicted coding sequences (see
Examples IV and V) are used to extend Incyte cDNA assemblages to full length. Assembly is
performed using programs based on Phred, Phrap, and Consed, and cDNA assemblages are screened
for open reading frames using programs based on GeneMark, BLAST, and FASTA. The full length
polynucleotide sequences are translated to derive the corresponding full length polypeptide
sequences. Alternatively, a polypeptide may begin at any of the methionine residues of the full length
translated polypeptide. Full length polypeptide sequences are subsequently analyzed by querying
against databases such as the GenBank protein databases (genpept), SwissProt, the PROTEOME
databases, BLOCKS, PRINTS, DOMO, PRODOM, Prosite, hidden Markov model (HMM)-based
protein family databases such as PFAM, INCY, and TIGRFAM; and HMM-based protein domain
databases such as SMART. Full length polynucleotide sequences are also analyzed using
MACDNASIS PRO software (MiraiBio, Alameda CA) and LASERGENE software (DNASTAR).
Polynucleotide and polypeptide sequence alignments are generated using default parameters specified
by the CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program
(DNASTAR), which also calculates the percent identity between aligned sequences.

Table 7 summarizes tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and
threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used,
the second column provides brief descriptions thereof, the third column presents appropriate
references, all of which are incorporated by reference herein in their entirety, and the fourth column
presents, where applicable, the scores, probability values, and other parameters used to evaluate the
strength of a match between two sequences (the higher the score or the lower the probability value,
the greater the identity between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide
and polypeptide sequences are also used to identify polynucleotide sequence fragments from SEQ ID
NO:6-10. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and
amplification technologies are described in Table 4, column 2.

IV. Identification and Editing of Coding Sequences from Genomic DNA 

Putative kinases and phosphatases are initially identified by running the Genscan gene
identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is
a general-purpose gene identification program which analyzes genomic DNA sequences from a
variety of organisms (Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94; Burge, C. and S. Karlin
(1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to form an
assembled cDNA sequence extending from a methionine to a stop codon. The output of Genscan is a
FASTA database of polynucleotide and polypeptide sequences. The maximum range of sequence for
Genscan to analyze at once is set to 30 kb. To determine which of these Genscan predicted cDNA
sequences encode kinases and phosphatases, the encoded polypeptides are analyzed by querying
against PFAM models for kinases and phosphatases. Potential kinases and phosphatases are also
identified by homology to Incyte cDNA sequences that have been annotated as kinases and
phosphatases. These selected Genscan-predicted sequences are then compared by BLAST analysis to
the genpept and gbpri public databases. Where necessary, the Genscan-predicted sequences are then
edited by comparison to the top BLAST hit from genpept to correct errors in the sequence predicted
by Genscan, such as extra or omitted exons. BLAST analysis is also used to find any Incyte cDNA or
public cDNA coverage of the Genscan-predicted sequences, thus providing evidence for transcription.
When Incyte cDNA coverage is available, this information is used to correct or confirm the Genscan
predicted sequence. Full length polynucleotide sequences are obtained by assembling
Genscan-predicted coding sequences with Incyte cDNA sequences and/or public cDNA sequences using the
assembly process described in Example III. Alternatively, full length polynucleotide sequences are
derived entirely from edited or unedited Genscan-predicted coding sequences.
V. Assembly of Genomic Sequence Data with cDNA Sequence Data

"Stitched" Sequences

Partial cDNA sequences are extended with exons predicted by the Genscan gene
identification program described in Example IV. Partial cDNAs assembled as described in Example
III are mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan exon
predictions from one or more genomic sequences. Each cluster is analyzed using an algorithm based
on graph theory and dynamic programming to integrate cDNA and genomic information, generating
possible splice variants that are subsequently confirmed, edited, or extended to create a full length
sequence. Sequence intervals in which the entire length of the interval is present on more than one
sequence in the cluster are identified, and intervals thus identified are considered to be equivalent by
transitivity. For example, if an interval is present on a cDNA and two genomic sequences, then all
three intervals are considered to be equivalent. This process allows unrelated but consecutive
genomic sequences to be brought together, bridged by cDNA sequence. Intervals thus identified are
then "stitched" together by the stitching algorithm in the order that they appear along their parent
sequences to generate the longest possible sequence, as well as sequence variants. Linkages between
intervals which proceed along one type of parent sequence (cDNA to cDNA or genomic sequence to
genomic sequence) are given preference over linkages which change parent type (cDNA to genomic
sequence). The resultant stitched sequences are translated and compared by BLAST analysis to the
genpept and gbpri public databases. Incorrect exons predicted by Genscan are corrected by
comparison to the top BLAST hit from genpept. Sequences are further extended with additional
cDNA sequences, or by inspection of genomic DNA, when necessary.
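
The equivalence-by-transitivity step can be illustrated with a small sketch. This is not the stitching algorithm itself, which the text describes only as based on graph theory and dynamic programming. It assumes that shared intervals are reported as pairs of labels (the labels "cDNA", "genomicA", and "genomicB" and the function name are hypothetical) and shows how such pairwise matches can be merged into equivalence classes with a union-find structure.

```python
def group_equivalent(pairs):
    """Merge pairwise-equivalent interval labels into equivalence classes.

    `pairs` is an iterable of (label_a, label_b) tuples, each stating that
    the same interval was found on two parent sequences. The labels are
    hypothetical; the source does not define a data format.
    """
    parent = {}

    def find(x):
        # Return the representative of x, registering x on first sight.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two classes

    classes = {}
    for label in list(parent):
        classes.setdefault(find(label), set()).add(label)
    return sorted(sorted(members) for members in classes.values())


# An interval seen on one cDNA and two genomic sequences: all three are
# placed in the same class even though the genomic pair was never compared.
groups = group_equivalent([("cDNA", "genomicA"), ("cDNA", "genomicB")])
```

Because the two genomic intervals are linked only through the cDNA, this is the situation in which unrelated but consecutive genomic sequences are bridged by cDNA sequence.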
"Stretched" Sequences

Partial DNA sequences are extended to full length with an algorithm based on BLAST
analysis. First, partial cDNAs assembled as described in Example III are queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases
using the BLAST program. The nearest GenBank protein homolog is then compared by BLAST
analysis to either Incyte cDNA sequences or GenScan exon predicted sequences described in
Example IV. A chimeric protein is generated by using the resultant high-scoring segment pairs
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions
may occur in the chimeric protein with respect to the original GenBank protein homolog. The
GenBank protein homolog, the chimeric protein, or both are used as probes to search for homologous
genomic sequences from the public human genome databases. Partial DNA sequences are therefore
"stretched" or extended by the addition of homologous genomic sequences. The resultant stretched
sequences are examined to determine whether they contain a complete gene.
VI. Chromosomal Mapping of KPP Encoding Polynucleotides

The sequences used to assemble SEQ ID NO:6-10 are compared with sequences from the
Incyte LIFESEQ database and public domain databases using BLAST and other implementations of
the Smith-Waterman algorithm. Sequences from these databases that matched SEQ ID NO:6-10 are
assembled into clusters of contiguous and overlapping sequences using assembly algorithms such as
Phrap (Table 7). Radiation hybrid and genetic mapping data available from public resources such as
the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome Research (WIGR),
and Généthon are used to determine if any of the clustered sequences have been previously mapped.
Inclusion of a mapped sequence in a cluster results in the assignment of all sequences of that cluster,
including its particular SEQ ID NO:, to that map location.

Map locations are represented by ranges, or intervals, of human chromosomes. The map
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in
humans, although this can vary widely due to hot and cold spots of recombination.) The cM
distances are based on genetic markers mapped by Généthon which provide boundaries for radiation
hybrid markers whose sequences were included in each of the clusters. Human genome maps and
other resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified
disease genes map within or in proximity to the intervals indicated above.
VII. Analysis of Polynucleotide Expression
Northern analysis is a laboratory technique used to detect the presence of a transcript of a
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound (Sambrook and Russell, supra, ch. 7; Ausubel et
al., supra, ch. 4).

Analogous computer techniques applying BLAST are used to search for identical or related
molecules in databases such as GenBank or LIFESEQ (Incyte). This analysis is much faster than
multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be
modified to determine whether any particular match is categorized as exact or similar. The basis of
the search is the product score, which is defined as:

\[
\text{product score} = \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}
\]

The product score takes into account both the degree of similarity between two sequences and the
length of the sequence match. The product score is a normalized value between 0 and 100, and is
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate
the product score. The product score represents a balance between fractional overlap and quality in a
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the
entire length of the shorter of the two sequences being compared. A product score of 70 is produced
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79%
identity and 100% overlap.
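
The arithmetic of the product score can be checked with a minimal sketch. It assumes a single ungapped HSP described only by its counts of matching and mismatching bases, and it assumes that percent identity is taken over that HSP; the function names are illustrative and are not part of BLAST.

```python
def blast_score(matches, mismatches):
    """Score an ungapped HSP with +5 per match and -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches, mismatches, len_seq1, len_seq2):
    """Product score as defined in the text.

    Assumption: percent identity is computed over the HSP itself,
    i.e. matches / (matches + mismatches) * 100.
    """
    aligned = matches + mismatches
    if aligned == 0:
        return 0.0
    percent_identity = 100.0 * matches / aligned
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)


# 100% identity over 70% of a 100-base sequence gives exactly 70.
full_identity_partial_overlap = product_score(70, 0, 100, 100)
# 88% identity over the full 100 bases gives about 69, i.e. roughly 70.
partial_identity_full_overlap = product_score(88, 12, 100, 100)
```

Under these assumptions the two routes to a score of about 70 quoted in the text agree to within one point, and 79% identity over the full length gives about 49, in line with the quoted score of 50.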

Alternatively, polynucleotides encoding KPP are analyzed with respect to the tissue sources
from which they are derived. For example, some full length sequences are assembled, at least in part,
with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived from a
cDNA library constructed from a human tissue. Each human tissue is classified into one of the
following organ/tissue categories: cardiovascular system; connective tissue; digestive system;
embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ
cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas;
respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract.
The number of libraries in each category is counted and divided by the total number of libraries
across all categories. Similarly, each human tissue is classified into one of the following
disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma,
cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided
by the total number of libraries across all categories. The resulting percentages reflect the tissue- and
disease-specific expression of cDNA encoding KPP. cDNA sequences and cDNA library/tissue
information are found in the LIFESEQ database (Incyte, Palo Alto CA).
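
The per-category percentages described above amount to a simple count-and-divide. The sketch below assumes that each contributing library has already been assigned one category label; the function name and the example labels are illustrative, and no LIFESEQ interface is implied.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return the percentage of libraries falling in each category.

    `library_categories` holds one category label per cDNA library,
    for example the organ/tissue category of each library.
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Four hypothetical libraries, three of them from nervous system tissue.
example = category_percentages(
    ["nervous system", "nervous system", "liver", "nervous system"]
)
```

The same function applies unchanged to the disease/condition categories, since only the labels differ.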

VIII. Extension of KPP Encoding Polynucleotides

Full length polynucleotides are produced by extension of an appropriate fragment of the full
length molecule using oligonucleotide primers designed from this fragment. One primer is
synthesized to initiate 5' extension of the known fragment, and the other primer is synthesized to
initiate 3' extension of the known fragment. The initial primers are designed using OLIGO 4.06
software (National Biosciences), or another appropriate program, to be about 22 to 30 nucleotides in
length, to have a GC content of about 50% or more, and to anneal to the target sequence at
temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin
structures and primer-primer dimerizations is avoided.
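
Two of the stated design criteria, primer length and GC content, are easy to screen programmatically. The following sketch is not OLIGO 4.06 and does not model annealing temperature, hairpins, or primer-primer dimers; the function names and default thresholds are assumptions taken from the ranges quoted above.

```python
def gc_content(primer):
    """Return the percentage of G and C bases in a primer sequence."""
    if not primer:
        raise ValueError("primer sequence is empty")
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def meets_basic_criteria(primer, min_len=22, max_len=30, min_gc=50.0):
    """Check the length (22 to 30 nt) and GC content (50% or more) criteria."""
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


# A hypothetical 24-nucleotide candidate with exactly 50% GC content.
candidate = "GCAT" * 6
acceptable = meets_basic_criteria(candidate)
```

A candidate that passes this screen would still need to be checked for annealing temperature and secondary structure with a dedicated primer design program.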

Selected human cDNA libraries are used to extend the sequence. If more than one extension
is necessary or desired, additional or nested sets of primers are designed.

High fidelity amplification is obtained by PCR using methods well known in the art. PCR is
performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction mix
contains DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, and
2-mercaptoethanol, Taq DNA polymerase (Amersham Biosciences), ELONGASE enzyme (Invitrogen),
and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair PCI A and PCI
B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps
2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the alternative, the
parameters for primer pair T7 and SK+ are as follows: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec;
Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5
min; Step 7: storage at 4°C.
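
The cycling parameters can be written down as data, which makes the programmed hold time easy to total. The sketch below makes two assumptions that the text does not settle: that "repeated 20 times" means 20 passes in addition to the first pass through Steps 2 to 4, and that ramp times between temperatures are ignored. The names are illustrative.

```python
def total_hold_seconds(initial_seconds, cycle, extra_repeats, final_seconds):
    """Sum the programmed hold times of a thermal cycling protocol.

    `cycle` is a list of (temperature_celsius, seconds) steps.
    Assumption: the cycle runs once and is then repeated `extra_repeats`
    more times; temperature ramp times are not included.
    """
    per_cycle = sum(seconds for _temperature, seconds in cycle)
    return initial_seconds + per_cycle * (1 + extra_repeats) + final_seconds


# Steps 2 to 4 for primer pair PCI A and PCI B, as listed in the text.
PCI_CYCLE = [(94, 15), (60, 60), (68, 120)]

# Step 1 is 3 min (180 s) and Step 6 is 5 min (300 s).
pci_seconds = total_hold_seconds(180, PCI_CYCLE, 20, 300)
```

Under these assumptions the PCI A and PCI B program holds for 4575 seconds, a little over 76 minutes, before the 4°C storage step.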

The concentration of DNA in each well is determined by dispensing 100 µl PICOGREEN
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar,
Acton MA), allowing the DNA to bind to the reagent. The plate is scanned in a Fluoroskan II
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture is analyzed by electrophoresis
on a 1% agarose gel to determine which reactions are successful in extending the sequence.

The extended nucleotides are desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Biosciences). For shotgun
sequencing, the digested nucleotides are separated on low concentration (0.6 to 0.8%) agarose gels,
fragments are excised, and agar digested with Agar ACE (Promega). Extended clones are religated
using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector (Amersham Biosciences),
treated with Pfu DNA polymerase (Stratagene) to fill in restriction site overhangs, and transfected
into competent E. coli cells. Transformed cells are selected on antibiotic-containing media, and
individual colonies are picked and cultured overnight at 37°C in 384-well plates in LB/2x carb liquid
media.

The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham
Biosciences) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94°C, 3
min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 1 min; Step 5: steps 2, 3, and 4
repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA is quantified by PICOGREEN
reagent (Molecular Probes) as described above. Samples with low DNA recoveries are reamplified
using the same conditions as described above. Samples are diluted with 20% dimethylsulfoxide (1:2,
v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC
DIRECT kit (Amersham Biosciences) or the ABI PRISM BIGDYE Terminator cycle sequencing
ready reaction kit (Applied Biosystems).

In like manner, full length polynucleotides are verified using the above procedure or are used
to obtain 5' regulatory sequences using the above procedure along with oligonucleotides designed for
such extension, and an appropriate genomic library.

IX. Identification of Single Nucleotide Polymorphisms in KPP Encoding Polynucleotides
Common DNA sequence variants known as single nucleotide polymorphisms (SNPs) are
identified in SEQ ID NO:6-10 using the LIFESEQ database (Incyte). Sequences from the same gene
are clustered together and assembled as described in Example III, allowing the identification of all
sequence variants in the gene. An algorithm consisting of a series of filters is used to distinguish
SNPs from other sequence variants. Preliminary filters remove the majority of basecall errors by
requiring a minimum Phred quality score of 15, and remove sequence alignment errors and errors
resulting from improper trimming of vector sequences, chimeras, and splice variants. An automated
procedure of advanced chromosome analysis is applied to the original chromatogram files in the
vicinity of the putative SNP. Clone error filters use statistically generated algorithms to identify
errors introduced during laboratory processing, such as those caused by reverse transcriptase,
polymerase, or somatic mutation. Clustering error filters use statistically generated algorithms to
identify errors resulting from clustering of close homologs or pseudogenes, or due to contamination
by non-human sequences. A final set of filters removes duplicates and SNPs found in
immunoglobulins or T-cell receptors.
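
The Phred threshold used by the preliminary filter can be put in terms of error probability. A Phred quality score Q corresponds to an estimated base-calling error probability of 10^(-Q/10), so a minimum score of 15 admits base calls with an estimated error rate of roughly 3%. The sketch below shows only this conversion and the threshold test; the function names are illustrative and the other filters described above are not modeled.

```python
def phred_error_probability(quality):
    """Convert a Phred quality score to its estimated error probability."""
    return 10 ** (-quality / 10)


def passes_quality_filter(quality, minimum=15):
    """Apply the minimum Phred quality score quoted in the text."""
    return quality >= minimum


# Estimated error probability at the threshold, about 0.0316.
threshold_error = phred_error_probability(15)
```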

Certain SNPs are selected for further characterization by mass spectrometry using the high
throughput MASSARRAY system (Sequenom, Inc.) to analyze allele frequencies at the SNP sites in
four different human populations. The Caucasian population comprises 92 individuals (46 male, 46
female), including 83 from Utah, four French, three Venezuelan, and two Amish individuals. The
African population comprises 194 individuals (97 male, 97 female), all African Americans. The
Hispanic population comprises 324 individuals (162 male, 162 female), all Mexican Hispanic. The
Asian population comprises 126 individuals (64 male, 62 female) with a reported parental breakdown
of 43% Chinese, 31% Japanese, 13% Korean, 5% Vietnamese, and 8% other Asian. Allele
frequencies are first analyzed in the Caucasian population; in some cases those SNPs which show no
allelic variance in this population are not further tested in the other three populations.

X. Labeling and Use of Individual Hybridization Probes

Hybridization probes derived from SEQ ID NO:6-10 are employed to screen cDNAs,
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base
pairs, is specifically described, essentially the same procedure is used with larger nucleotide
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of
[γ-32P] adenosine triphosphate (Amersham Biosciences), and T4 polynucleotide kinase (DuPont NEN,
Boston MA). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25
superfine size exclusion dextran bead column (Amersham Biosciences). An aliquot containing 10^7
counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of
human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I,
Xba I, or Pvu II (DuPont NEN).

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate.
Hybridization patterns are visualized using autoradiography or an alternative imaging means and
compared.

XI. Microarrays

The linkage or synthesis of array elements upon a microarray can be achieved utilizing
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler et al., supra),
mechanical microspotting technologies, and derivatives thereof. The substrate in each of the
aforementioned technologies should be uniform and solid with a non-porous surface (Schena, M., ed.
(1999) DNA Microarrays: A Practical Approach, Oxford University Press, London). Suggested
substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a
procedure analogous to a dot or slot blot may also be used to arrange and link elements to the surface
of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may
be produced using available methods and machines well known to those of ordinary skill in the art
and may contain any appropriate number of elements (Schena, M. et al. (1995) Science 270:467-470;
Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998) Nat.
Biotechnol. 16:27-31).

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be
selected using software well known in the art such as LASERGENE software (DNASTAR). The
array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in
the biological sample are conjugated to a fluorescent label or other molecular tag for ease of
detection. After hybridization, nonhybridized nucleotides from the biological sample are removed,
and a fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser
desorption and mass spectrometry may be used for detection of hybridization. The degree of
complementarity and the relative abundance of each polynucleotide which hybridizes to an element
on the microarray may be assessed. In one embodiment, microarray preparation and usage is
described in detail below.
Tissue or Cell Sample Preparation 

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-(dT) primer (21mer), 1X
first strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40
µM dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Biosciences). The reverse
transcription reaction is performed in a 25 ml volume containing 200 ng poly(A)+ RNA with
GEMBRIGHT kits (Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription
from non-coding yeast genomic DNA. After incubation at 37°C for 2 hr, each reaction sample (one
with Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and
incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified
using two successive CHROMA SPIN 30 gel filtration spin columns (BD Clontech, Palo Alto CA)
and after combining, both reaction samples are ethanol precipitated using 1 ml of glycogen (1
mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The sample is then dried to completion
using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended in 14 µl 5X SSC/0.2%
SDS.

Microarray Preparation

Sequences of the present invention are used to generate array elements. Each array element
is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification
uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5
µg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Biosciences).

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water,
and coated with 0.05% aminopropyl silane (Sigma-Aldrich, St. Louis MO) in 95% ethanol. Coated
slides are cured in a 110°C oven.

Array elements are applied to the coated glass substrate using a procedure described in U.S.
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic
apparatus. The apparatus then deposits about 5 nl of array element sample per slide.

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in
0.2% SDS and distilled water as before.
Hybridization 

Hybridization reactions contain 9 µl of sample mixture consisting of 0.2 µg each of Cy3 and
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered
with an 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just
slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the
addition of 140 µl of 5X SSC in a corner of the chamber. The chamber containing the arrays is
incubated for about 6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash
buffer (1X SSC, 0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X
SSC), and dried.
Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the microscope and
raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a
resolution of 20 micrometers.

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially.
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477,
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores.
Appropriate filters positioned between the array and the photomultiplier tubes are used to filter the
signals. The emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5.
Each array is typically scanned twice, one scan per fluorophore using the appropriate filters at the
laser source, although the apparatus is capable of recording the spectra from both fluorophores
simultaneously.

The sensitivity of the scans is typically calibrated using the signal intensity generated by a
cDNA control species added to the sample mixture at a known concentration. A specific location on
the array contains a complementary DNA sequence, allowing the intensity of the signal at that
location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples
from different sources (e.g., representing test and control cells), each labeled with a different
fluorophore, are hybridized to a single array for the purpose of identifying genes that are
differentially expressed, the calibration is done by labeling samples of the calibrating cDNA with the
two fluorophores and adding identical amounts of each to the hybridization mixture.
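
One possible way to apply the two-fluorophore calibration described above is sketched below. It assumes, as a simplification not stated in the text, that a single multiplicative factor derived from the calibrating cDNA spot is sufficient to balance the Cy5 channel against the Cy3 channel. All names are hypothetical.

```python
def balance_channels(cy3, cy5, control_cy3, control_cy5):
    """Rescale Cy5 readings so the calibrating control reads the same in both channels.

    Assumption: identical amounts of the calibrating cDNA were labeled with each
    fluorophore, so any difference at the control spot reflects channel
    sensitivity rather than abundance.
    """
    if control_cy3 <= 0 or control_cy5 <= 0:
        raise ValueError("control intensities must be positive")
    factor = control_cy3 / control_cy5
    return [(a, b * factor) for a, b in zip(cy3, cy5)]


# Hypothetical readings: the control spot is twice as bright in Cy3 as in Cy5.
balanced = balance_channels([100.0, 400.0], [50.0, 50.0], 200.0, 100.0)
print(balanced)
```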

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC
computer. The digitized data are displayed as an image where the signal intensity is mapped using a
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping
emission spectra) between the fluorophores using each fluorophore's emission spectrum.
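
The crosstalk correction mentioned above can be illustrated with a two-channel linear model. This is a sketch under stated assumptions, not the correction actually implemented by the analysis software: it assumes each detector sees its own fluorophore plus a fixed fraction of the other, and the leak fractions (which in practice would be derived from the emission spectra and filter passbands) are hypothetical parameters.

```python
def correct_crosstalk(measured_cy3, measured_cy5, leak_5_into_3, leak_3_into_5):
    """Recover true channel signals from measured ones.

    Assumed model:
        measured_cy3 = true_cy3 + leak_5_into_3 * true_cy5
        measured_cy5 = leak_3_into_5 * true_cy3 + true_cy5
    """
    det = 1.0 - leak_5_into_3 * leak_3_into_5
    if det <= 0:
        raise ValueError("leak fractions are too large for a unique correction")
    true_cy3 = (measured_cy3 - leak_5_into_3 * measured_cy5) / det
    true_cy5 = (measured_cy5 - leak_3_into_5 * measured_cy3) / det
    return true_cy3, true_cy5


# Hypothetical example: true signals 1000 (Cy3) and 500 (Cy5) with 10% and 20% leaks
# give measured values of 1050 and 700.
print(correct_crosstalk(1050.0, 700.0, 0.10, 0.20))
```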

A grid is superimposed over the fluorescence signal image such that the signal from each
spot is centered in each element of the grid. The fluorescence signal within each element is then
integrated to obtain a numerical value corresponding to the average intensity of the signal. The
software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
Array elements that exhibit at least about a two-fold change in expression, a signal-to-background
ratio of at least about 2.5, and an element spot size of at least about 40%, are considered to be
differentially expressed.
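
The three selection criteria above can be expressed as a simple filter. The sketch below is illustrative and does not reproduce the GEMTOOLS program. It assumes that the fold change is taken in either direction, that the signal-to-background ratio uses the stronger of the two channel signals, and that spot size is given as a fraction of the grid element. These interpretations, and all names, are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ArrayElement:
    test_signal: float      # integrated intensity, test sample channel
    control_signal: float   # integrated intensity, control sample channel
    background: float       # local background intensity
    spot_fraction: float    # spot size as a fraction of the grid element (0 to 1)


def is_differentially_expressed(el, min_fold=2.0, min_signal_to_background=2.5,
                                min_spot_fraction=0.40):
    """Apply the fold-change, signal-to-background, and spot-size thresholds."""
    if min(el.test_signal, el.control_signal, el.background) <= 0:
        return False  # ratios are undefined without positive readings
    fold = max(el.test_signal / el.control_signal,
               el.control_signal / el.test_signal)
    signal_to_background = max(el.test_signal, el.control_signal) / el.background
    return (fold >= min_fold
            and signal_to_background >= min_signal_to_background
            and el.spot_fraction >= min_spot_fraction)


print(is_differentially_expressed(ArrayElement(400.0, 100.0, 50.0, 0.6)))
```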
Expression

For example, expression of SEQ ID NO:10 was down-regulated in diseased lung tissue versus
normal lung tissue as determined by microarray analysis. Expression of SEQ ID NO:10 was
decreased at least two-fold in the lung tumor tissue with squamous cell carcinoma as compared to
grossly uninvolved lung tissue from the same donor using a pair comparison experimental design.
Therefore, in various embodiments, SEQ ID NO:10 can be used for one or more of the following: i)
monitoring treatment of lung cancer, ii) diagnostic assays for lung cancer, and iii) developing
therapeutics and/or other treatments for lung cancer.
XII. Complementary Polynucleotides

Sequences complementary to the KPP-encoding sequences, or any parts thereof, are used to
detect, decrease, or inhibit expression of naturally occurring KPP. Although use of oligonucleotides
comprising from about 15 to 30 base pairs is described, essentially the same procedure is used with
smaller or with larger sequence fragments. Appropriate oligonucleotides are designed using OLIGO
4.06 software (National Biosciences) and the coding sequence of KPP. To inhibit transcription, a
complementary oligonucleotide is designed from the most unique 5' sequence and used to prevent
promoter binding to the coding sequence. To inhibit translation, a complementary oligonucleotide is
designed to prevent ribosomal binding to the KPP-encoding transcript.
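
As a simple illustration of what designing a complementary oligonucleotide involves, the sketch below returns the reverse complement of a DNA target region. It does not reproduce the OLIGO 4.06 software, and it ignores design considerations such as melting temperature and uniqueness. The function name and the example sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    if not set(dna) <= set("ACGTacgt"):
        raise ValueError("sequence may contain only A, C, G, and T")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return dna.translate(_COMPLEMENT)[::-1]


target = "ATGGCCAAGCTTGGA"  # hypothetical 15-base target region
print(reverse_complement(target))
```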
XIII. Expression of KPP
Expression and purification of KPP is achieved using bacterial or virus-based expression
systems. For expression of KPP in bacteria, cDNA is subcloned into an appropriate vector containing
an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription.
Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the
T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element.
Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). Antibiotic
resistant bacteria express KPP upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG).
Expression of KPP in eukaryotic cells is achieved by infecting insect or mammalian cell lines with
recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), commonly known as
baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding KPP
by either homologous recombination or bacterial-mediated transposition involving transfer plasmid
intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels
of cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect
cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional
genetic modifications to baculovirus (Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA
91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945).

In most expression systems, KPP is synthesized as a fusion protein with, e.g., glutathione
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step,
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a
26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on
immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham
Biosciences). Following purification, the GST moiety can be proteolytically cleaved from KPP at
specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak).
6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins
(QIAGEN). Methods for protein expression and purification are discussed in Ausubel et al. (supra,
ch. 10 and 16). Purified KPP obtained by these methods can be used directly in the assays shown in
Examples XVII, XVIII, XIX, XX, and XXI, where applicable.
XIV. Functional Assays 

KPP function is assessed by expressing the sequences encoding KPP at physiologically
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice
include PCMV SPORT plasmid (Invitrogen, Carlsbad CA) and PCR3.1 plasmid (Invitrogen), both of
which contain the cytomegalovirus promoter. 5-10 µg of recombinant vector are transiently
transfected into a human cell line, for example, an endothelial or hematopoietic cell line, using either
liposome formulations or electroporation. 1-2 µg of an additional plasmid containing sequences
encoding a marker protein are co-transfected. Expression of a marker protein provides a means to
distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression
from the recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein
(GFP; BD Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated,
laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to
evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the
uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These
events include changes in nuclear DNA content as measured by staining of DNA with propidium
iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side
light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine
uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity
with specific antibodies; and alterations in plasma membrane composition as measured by the binding
of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are
discussed in Ormerod, M.G. (1994; Flow Cytometry, Oxford, New York NY).

The influence of KPP on gene expression can be assessed using highly purified populations
of cells transfected with sequences encoding KPP and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success
NY). mRNA can be purified from the cells using methods well known by those of skill in the art.
Expression of mRNA encoding KPP and other genes of interest can be analyzed by northern analysis
or microarray techniques.

XV. Production of KPP Specific Antibodies

KPP substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g.,
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to
immunize animals (e.g., rabbits, mice, etc.) and to produce antibodies using standard protocols.

Alternatively, the KPP amino acid sequence is analyzed using LASERGENE software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well
described in the art (Ausubel et al., supra, ch. 11).

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to
increase immunogenicity (Ausubel et al., supra). Rabbits are immunized with the oligopeptide-KLH
complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-KPP
activity by, for example, binding the peptide or KPP to a substrate, blocking with 1% BSA, reacting
with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG.
XVI. Purification of Naturally Occurring KPP Using Specific Antibodies
Naturally occurring or recombinant KPP is substantially purified by immunoaffinity
chromatography using antibodies specific for KPP. An immunoaffinity column is constructed by
covalently coupling anti-KPP antibody to an activated chromatographic resin, such as CNBr-activated
SEPHAROSE (Amersham Biosciences). After the coupling, the resin is blocked and washed
according to the manufacturer's instructions.

Media containing KPP are passed over the immunoaffinity column, and the column is washed
under conditions that allow the preferential absorbance of KPP (e.g., high ionic strength buffers in the
presence of detergent). The column is eluted under conditions that disrupt antibody/KPP binding
(e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate
ion), and KPP is collected.

XVII. Identification of Molecules Which Interact with KPP
KPP, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent
(Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539). Candidate molecules previously
arrayed in the wells of a multi-well plate are incubated with the labeled KPP, washed, and any wells
with labeled KPP complex are assayed. Data obtained using different concentrations of KPP are used
to calculate values for the number, affinity, and association of KPP with the candidate molecules.
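
One conventional way to derive the number of binding sites and the affinity from binding data collected at several ligand concentrations is a Scatchard analysis. The text does not name the calculation used, so the sketch below is offered only as an assumed example: it fits bound/free against bound by least squares, taking the slope as -1/Kd and the x-intercept as Bmax. The function name and the example data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit bound/free against bound by least squares; return (Kd, Bmax).

    Assumes a single class of independent binding sites, so that
    bound/free = Bmax/Kd - bound/Kd.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or sxy >= 0:
        raise ValueError("data do not show a negative Scatchard slope")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd  # x-intercept of the fitted line
    return kd, bmax


# Hypothetical, noise-free data generated with Kd = 2 and Bmax = 10 (arbitrary units).
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
kd, bmax = scatchard_fit(bound, free)
print(round(kd, 3), round(bmax, 3))
```

With real data, nonlinear regression on the untransformed binding curve is often preferred, since the Scatchard transformation distorts measurement error.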

Alternatively, molecules interacting with KPP are analyzed using the yeast two-hybrid
system as described in Fields, S. and O. Song (1989; Nature 340:245-246), or using commercially
available kits based on the two-hybrid system, such as the MATCHMAKER system (BD Clontech).

KPP may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT)
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S.
Patent No. 6,057,101).

XVIII. Demonstration of KPP Activity

Generally, protein kinase activity is measured by quantifying the phosphorylation of a protein
substrate by KPP in the presence of [γ-32P]ATP. KPP is incubated with the protein substrate,
32P-ATP, and an appropriate kinase buffer. The 32P incorporated into the substrate is separated from
free 32P-ATP by electrophoresis and the incorporated 32P is counted using a radioisotope counter. The
amount of incorporated 32P is proportional to the activity of KPP. A determination of the specific
amino acid residue phosphorylated is made by phosphoamino acid analysis of the hydrolyzed protein.

In one alternative, protein kinase activity is measured by quantifying the transfer of gamma
phosphate from adenosine triphosphate (ATP) to a serine, threonine or tyrosine residue in a protein
substrate. The reaction occurs between a protein kinase sample with a biotinylated peptide substrate
and gamma 32P-ATP. Following the reaction, free avidin in solution is added for binding to the
biotinylated 32P-peptide product. The binding sample then undergoes a centrifugal ultrafiltration
process with a membrane which will retain the product-avidin complex and allow passage of free
gamma 32P-ATP. The reservoir of the centrifuged unit containing the 32P-peptide product as retentate
is then counted in a scintillation counter. This procedure allows the assay of any type of protein
kinase sample, depending on the peptide substrate and kinase reaction buffer selected. This assay is
provided in kit form (ASUA, Affinity Ultrafiltration Separation Assay, Transbio Corporation,
Baltimore MD, U.S. Patent No. 5,869,275). Suggested substrates and their respective enzymes
include but are not limited to: Histone H1 (Sigma) and p34cdc2 kinase, Annexin I, Angiotensin (Sigma)
and EGF receptor kinase, Annexin II and src kinase, ERK1 & ERK2 substrates and MEK, and myelin
basic protein and ERK (Pearson, J.D. et al. (1991) Methods Enzymol. 200:62-81).

In another alternative, protein kinase activity of KPP is demonstrated in an assay containing
KPP, 50 µl of kinase buffer, 1 µg substrate, such as myelin basic protein (MBP) or synthetic peptide
substrates, 1 mM DTT, 10 µg ATP, and 0.5 µCi [γ-32P]ATP. The reaction is incubated at 30°C for 30
minutes and stopped by pipetting onto P81 paper. The unincorporated [γ-32P]ATP is removed by
washing and the incorporated radioactivity is measured using a scintillation counter. Alternatively,
the reaction is stopped by heating to 100°C in the presence of SDS loading buffer and resolved on a
12% SDS polyacrylamide gel followed by autoradiography. The amount of incorporated 32P is
proportional to the activity of KPP.
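
Because the incorporated radioactivity is proportional to kinase activity, scintillation counts can be converted to a specific activity once the specific radioactivity of the ATP is known. The sketch below is illustrative only. The blank subtraction, the units, and all parameter names are assumptions rather than part of the described assay.

```python
def kinase_specific_activity(sample_cpm, blank_cpm, cpm_per_pmol_atp,
                             minutes, enzyme_ug):
    """Return activity in pmol phosphate transferred per minute per µg enzyme.

    Assumptions: a no-enzyme blank is counted alongside the sample, and the
    specific radioactivity of the ATP mix (cpm per pmol) has been measured.
    """
    if cpm_per_pmol_atp <= 0 or minutes <= 0 or enzyme_ug <= 0:
        raise ValueError("specific radioactivity, time, and enzyme must be positive")
    pmol_transferred = (sample_cpm - blank_cpm) / cpm_per_pmol_atp
    return pmol_transferred / minutes / enzyme_ug


# Hypothetical counts for a 30-minute reaction with 0.5 µg of enzyme.
print(kinase_specific_activity(10_500, 500, 100.0, 30, 0.5))
```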

In yet another alternative, adenylate kinase or guanylate kinase activity of KPP may be
measured by the incorporation of 32P from [γ-32P]ATP into ADP or GDP using a gamma radioisotope
counter. KPP, in a kinase buffer, is incubated together with the appropriate nucleotide
mono-phosphate substrate (AMP or GMP) and 32P-labeled ATP as the phosphate donor. The reaction
is incubated at 37°C and terminated by addition of trichloroacetic acid. The acid extract is
neutralized and subjected to gel electrophoresis to separate the mono-, di-, and triphosphonucleotide
fractions. The diphosphonucleotide fraction is excised and counted. The radioactivity recovered is
proportional to the activity of KPP.

In yet another alternative, other assays for KPP include scintillation proximity assays (SPA),
scintillation plate technology and filter binding assays. Useful substrates include recombinant
proteins tagged with glutathione transferase, or synthetic peptide substrates tagged with biotin.
Inhibitors of KPP activity, such as small organic molecules, proteins or peptides, may be identified by
such assays.

In another alternative, phosphatase activity of KPP is measured by the hydrolysis of
para-nitrophenyl phosphate (PNPP). KPP is incubated together with PNPP in HEPES buffer pH 7.5, in the
presence of 0.1% β-mercaptoethanol at 37°C for 60 min. The reaction is stopped by the addition of 6
ml of 10 N NaOH (Diamond, R.H. et al. (1994) Mol. Cell. Biol. 14:3752-62). Alternatively, acid
phosphatase activity of KPP is demonstrated by incubating KPP-containing extract with 100 µl of 10
mM PNPP in 0.1 M sodium citrate, pH 4.5, and 50 µl of 40 mM NaCl at 37°C for 20 min. The
reaction is stopped by the addition of 0.5 ml of 0.4 M glycine/NaOH, pH 10.4 (Saftig, P. et al. (1997)
J. Biol. Chem. 272:18628-18635). The increase in light absorbance at 410 nm resulting from the
hydrolysis of PNPP is measured using a spectrophotometer. The increase in light absorbance is
proportional to the activity of KPP in the assay.
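
The proportionality between absorbance and phosphatase activity follows from the Beer-Lambert law. The sketch below converts an absorbance increase into the amount of para-nitrophenol released. The molar extinction coefficient used as a default (18,000 per M per cm) is an assumed, approximate value for para-nitrophenolate under alkaline conditions and should be replaced by a value measured at 410 nm in the actual stop buffer. The function name and example numbers are hypothetical.

```python
def pnp_released_nmol(delta_a410, final_volume_ml,
                      epsilon_per_m_cm=18_000.0, path_cm=1.0):
    """Convert an absorbance increase to nmol of para-nitrophenol released.

    Beer-Lambert: A = epsilon * path * concentration.
    epsilon_per_m_cm is an assumed default, not a value given in the text.
    """
    if epsilon_per_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    conc_molar = delta_a410 / (epsilon_per_m_cm * path_cm)
    liters = final_volume_ml / 1000.0
    return conc_molar * liters * 1e9  # mol to nmol


# Hypothetical reading: absorbance rises by 0.18 in a 1 ml final volume.
print(pnp_released_nmol(0.18, 1.0))
```

Dividing the result by the incubation time and the amount of enzyme gives a specific activity.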

In the alternative, KPP activity is determined by measuring the amount of phosphate removed
from a phosphorylated protein substrate. Reactions are performed with 2 or 4 nM KPP in a final
volume of 30 µl containing 60 mM Tris, pH 7.6, 1 mM EDTA, 1 mM EGTA, 0.1%
β-mercaptoethanol and 10 µM substrate, 32P-labeled on serine/threonine or tyrosine, as appropriate.
Reactions are initiated with substrate and incubated at 30°C for 10-15 min. Reactions are quenched
with 450 µl of 4% (w/v) activated charcoal in 0.6 M HCl, 90 mM Na4P2O7, and 2 mM NaH2PO4, then
centrifuged at 12,000 x g for 5 min. Acid-soluble 32Pi is quantified by liquid scintillation counting
(Sinclair, C. et al. (1999) J. Biol. Chem. 274:23666-23672).

XIX. Kinase Binding Assay

Binding of KPP to a FLAG-CD44cyt fusion protein can be determined by incubating KPP
with anti-KPP-conjugated immunoaffinity beads followed by incubating portions of the beads (having
10-20 ng of protein) with 0.5 ml of a binding buffer (20 mM Tris-HCl (pH 7.4), 150 mM NaCl, 0.1%
bovine serum albumin, and 0.05% Triton X-100) in the presence of 125I-labeled FLAG-CD44cyt
fusion protein (5,000 cpm/ng protein) at 4°C for 5 hours. Following binding, beads are washed
thoroughly in the binding buffer and the bead-bound radioactivity is measured in a scintillation counter
(Bourguignon, L.Y.W. et al. (2001) J. Biol. Chem. 276:7327-7336). The amount of incorporated 32P
is proportional to the amount of bound KPP.

XX. Identification of KPP Inhibitors 

Compounds to be tested are arrayed in the wells of a 384-well plate in varying concentrations
along with an appropriate buffer and substrate, as described in the assays in Example XVII. KPP
activity is measured for each well and the ability of each compound to inhibit KPP activity can be
determined, as well as the dose-response kinetics. This assay could also be used to identify molecules
which enhance KPP activity.
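
Dose-response data of the kind described above are commonly summarized as an IC50, the concentration at which activity falls to half of the uninhibited control. The sketch below estimates it by linear interpolation on a log-concentration scale between the two tested concentrations that bracket 50% activity. This is a simplified, assumed approach (a fitted sigmoidal curve would normally be preferred), and all names and data are hypothetical.

```python
import math


def ic50_by_interpolation(concentrations, activities, control_activity):
    """Estimate IC50 by log-linear interpolation; return None if not bracketed.

    Concentrations must be positive; activities share the units of the control.
    """
    half = control_activity / 2.0
    points = sorted(zip(concentrations, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= half >= a_hi and a_lo != a_hi:
            fraction = (a_lo - half) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None  # activity never crossed 50% within the tested range


# Hypothetical wells: concentrations in µM, activities as percent of control.
print(ic50_by_interpolation([1.0, 10.0, 100.0], [90.0, 60.0, 20.0], 100.0))
```

Returning None for compounds that never reach 50% inhibition keeps weak inhibitors, and compounds that enhance activity, from being assigned a misleading value.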

XXI. Identification of KPP Substrates

A KPP "substrate-trapping" assay takes advantage of the increased substrate affinity that may
be conferred by certain mutations in the PTP signature sequence of protein tyrosine phosphatases.
KPP bearing these mutations form a stable complex with their substrate; this complex may be isolated
biochemically. Site-directed mutagenesis of invariant residues in the PTP signature sequence in a
clone encoding the catalytic domain of KPP is performed using a method standard in the art or a
commercial kit, such as the MUTA-GENE kit from BIO-RAD. For expression of KPP mutants in
Escherichia coli, DNA fragments containing the mutation are exchanged with the corresponding
wild-type sequence in an expression vector bearing the sequence encoding KPP or a glutathione
S-transferase (GST)-KPP fusion protein. KPP mutants are expressed in E. coli and purified by
chromatography.

The expression vector is transfected into COS1 or 293 cells via calcium phosphate-mediated
transfection with 20 µg of CsCl-purified DNA per 10-cm dish of cells or 8 µg per 6-cm dish.
Forty-eight hours after transfection, cells are stimulated with 100 ng/ml epidermal growth factor to increase
tyrosine phosphorylation in cells, as the tyrosine kinase EGFR is abundant in COS cells. Cells are
lysed in 50 mM Tris-HCl, pH 7.5/5 mM EDTA/150 mM NaCl/1% Triton X-100/5 mM iodoacetic
acid/10 mM sodium phosphate/10 mM NaF/5 µg/ml leupeptin/5 µg/ml aprotinin/1 mM benzamidine
(1 ml per 10-cm dish, 0.5 ml per 6-cm dish). KPP is immunoprecipitated from lysates with an
appropriate antibody. GST-KPP fusion proteins are precipitated with glutathione-Sepharose, 4 µg of
mAb or 10 µl of beads respectively per mg of cell lysate. Complexes can be visualized by PAGE or
further purified to identify substrate molecules (Flint, A.J. et al. (1997) Proc. Natl. Acad. Sci. USA
94:1680-1685).

Various modifications and variations of the described compositions, methods, and systems of
the invention will be apparent to those skilled in the art without departing from the scope and spirit of
the invention. It will be appreciated that the invention provides novel and useful proteins, and their
encoding polynucleotides, which can be used in the drug discovery process, as well as methods for
using these compositions for the detection, diagnosis, and treatment of diseases and conditions.
Although the invention has been described in connection with certain embodiments, it should be
understood that the invention as claimed should not be unduly limited to such specific embodiments.
Nor should the description of such embodiments be considered exhaustive or limit the invention to
the precise forms disclosed. Furthermore, elements from one embodiment can be readily recombined
with elements from one or more other embodiments. Such combinations can form a number of
embodiments within the scope of the invention. It is intended that the scope of the invention be
defined by the following claims and their equivalents.

What is claimed is:

1. An isolated polypeptide selected from the group consisting of:

a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-5,

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-2,

c) a polypeptide comprising a naturally occurring amino acid sequence at least 97%
identical to the amino acid sequence of SEQ ID NO:5,

d) a biologically active fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-5, and

e) an immunogenic fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-5.

2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the
group consisting of SEQ ID NO:1-5.

3. An isolated polynucleotide encoding a polypeptide of claim 1. 

4. An isolated polynucleotide encoding a polypeptide of claim 2. 

5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO:6-10. 

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide of claim 3.

7. A cell transformed with a recombinant polynucleotide of claim 6.

8. A transgenic organism comprising a recombinant polynucleotide of claim 6. 



9. A method of producing a polypeptide of claim 1, the method comprising:

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein
said cell is transformed with a recombinant polynucleotide, and said recombinant
polynucleotide comprises a promoter sequence operably linked to a polynucleotide
encoding the polypeptide of claim 1, and

b) recovering the polypeptide so expressed.

10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-5.

11. An isolated antibody which specifically binds to a polypeptide of claim 1.

12. An isolated polynucleotide selected from the group consisting of:

a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:6-10,

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
90% identical to a polynucleotide sequence selected from the group consisting of
SEQ ID NO:6-8,

c) a polynucleotide comprising a naturally occurring polynucleotide sequence at least
98% identical to the polynucleotide sequence of SEQ ID NO:10,

d) a polynucleotide complementary to a polynucleotide of a),

e) a polynucleotide complementary to a polynucleotide of b),

f) a polynucleotide complementary to a polynucleotide of c), and

g) an RNA equivalent of a)-f).

13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a
polynucleotide of claim 12.



14. A method of detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 12, the method comprising:

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample,
and which probe specifically hybridizes to said target polynucleotide, under
conditions whereby a hybridization complex is formed between said probe and said
target polynucleotide or fragments thereof, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if
present, the amount thereof.




15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides. 

16. A method of detecting a target polynucleotide in a sample, said target polynucleotide
having a sequence of a polynucleotide of claim 12, the method comprising:

a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and

b) detecting the presence or absence of said amplified target polynucleotide or fragment
thereof, and, optionally, if present, the amount thereof.

17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable
excipient.

18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO:1-5.

19. A method for treating a disease or condition associated with decreased expression of
functional KPP, comprising administering to a patient in need of such treatment the composition of
claim 17.

20. A method of screening a compound for effectiveness as an agonist of a polypeptide of
claim 1, the method comprising:

a) contacting a sample comprising a polypeptide of claim 1 with a compound, and

b) detecting agonist activity in the sample.

21. A composition comprising an agonist compound identified by a method of claim 20 and a
pharmaceutically acceptable excipient.

22. A method for treating a disease or condition associated with decreased expression of
functional KPP, comprising administering to a patient in need of such treatment a composition of
claim 21.

23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of
claim 1, the method comprising:

a) contacting a sample comprising a polypeptide of claim 1 with a compound, and

b) detecting antagonist activity in the sample.

24. A composition comprising an antagonist compound identified by a method of claim 23
and a pharmaceutically acceptable excipient.

25. A method for treating a disease or condition associated with overexpression of functional
KPP, comprising administering to a patient in need of such treatment a composition of claim 24.

26. A method of screening for a compound that specifically binds to the polypeptide of claim
1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under suitable
conditions, and

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby
identifying a compound that specifically binds to the polypeptide of claim 1.

27. A method of screening for a compound that modulates the activity of the polypeptide of
claim 1, the method comprising:

a) combining the polypeptide of claim 1 with at least one test compound under
conditions permissive for the activity of the polypeptide of claim 1,

b) assessing the activity of the polypeptide of claim 1 in the presence of the test
compound, and

c) comparing the activity of the polypeptide of claim 1 in the presence of the test
compound with the activity of the polypeptide of claim 1 in the absence of the test
compound, wherein a change in the activity of the polypeptide of claim 1 in the
presence of the test compound is indicative of a compound that modulates the activity
of the polypeptide of claim 1.

28. A method of screening a compound for effectiveness in altering expression of a target
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method
comprising:

a) contacting a sample comprising the target polynucleotide with a compound, under
conditions suitable for the expression of the target polynucleotide,

b) detecting altered expression of the target polynucleotide, and

c) comparing the expression of the target polynucleotide in the presence of varying
amounts of the compound and in the absence of the compound.

29. A method of screening for potential toxicity of a test compound, the method comprising:

a) treating a biological sample containing nucleic acids with the test compound,

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising
at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions
whereby a specific hybridization complex is formed between said probe and a target
polynucleotide in the biological sample, said target polynucleotide comprising a
polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof,

c) quantifying the amount of hybridization complex, and

d) comparing the amount of hybridization complex in the treated biological sample with
the amount of hybridization complex in an untreated biological sample, wherein a
difference in the amount of hybridization complex in the treated biological sample
indicates potential toxicity of the test compound.

30. A method for a diagnostic test for a condition or disease associated with the expression 
of KPP in a biological sample, the method comprising: 

a) combining the biological sample with an antibody of claim 11, under conditions 
suitable for the antibody to bind the polypeptide and form an antibody:polypeptide 
complex, and 

b) detecting the complex, wherein the presence of the complex correlates with the 
presence of the polypeptide in the biological sample. 


31. The antibody of claim 11, wherein the antibody is: 

a) a chimeric antibody, 

b) a single chain antibody, 

c) a Fab fragment, 

d) a F(ab')2 fragment, or 

e) a humanized antibody. 

32. A composition comprising an antibody of claim 11 and an acceptable excipient. 

33. A method of diagnosing a condition or disease associated with the expression of KPP in 
a subject, comprising administering to said subject an effective amount of the composition of claim 
32. 

34. A composition of claim 32, further comprising a label. 

35. A method of diagnosing a condition or disease associated with the expression of KPP in 
a subject, comprising administering to said subject an effective amount of the composition of claim 
34. 

36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 
11, the method comprising: 

a) immunizing an animal with a polypeptide consisting of an amino acid sequence 
selected from the group consisting of SEQ ID NO:1-5, or an immunogenic fragment 
thereof, under conditions to elicit an antibody response, 

b) isolating antibodies from the animal, and 

c) screening the isolated antibodies with the polypeptide, thereby identifying a 
polyclonal antibody which specifically binds to a polypeptide comprising an amino 
acid sequence selected from the group consisting of SEQ ID NO:1-5. 

37. A polyclonal antibody produced by a method of claim 36. 

38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier. 



39. A method of making a monoclonal antibody with the specificity of the antibody of claim 
11, the method comprising: 

a) immunizing an animal with a polypeptide consisting of an amino acid sequence 
selected from the group consisting of SEQ ID NO:1-5, or an immunogenic fragment 
thereof, under conditions to elicit an antibody response, 

b) isolating antibody producing cells from the animal, 

c) fusing the antibody producing cells with immortalized cells to form monoclonal 
antibody-producing hybridoma cells, 

d) culturing the hybridoma cells, and 

e) isolating from the culture monoclonal antibody which specifically binds to a 
polypeptide comprising an amino acid sequence selected from the group consisting of 
SEQ ID NO:1-5. 


40. A monoclonal antibody produced by a method of claim 39. 

41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier. 

42. The antibody of claim 11, wherein the antibody is produced by screening a Fab 
expression library. 




43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant 
immunoglobulin library. 

44. A method of detecting a polypeptide comprising an amino acid sequence selected from 
the group consisting of SEQ ID NO:1-5 in a sample, the method comprising: 

a) incubating the antibody of claim 11 with the sample under conditions to allow 
specific binding of the antibody and the polypeptide, and 

b) detecting specific binding, wherein specific binding indicates the presence of a 
polypeptide comprising an amino acid sequence selected from the group consisting of 
SEQ ID NO:1-5 in the sample. 

45. A method of purifying a polypeptide comprising an amino acid sequence selected from 
the group consisting of SEQ ID NO:1-5 from a sample, the method comprising: 

a) incubating the antibody of claim 11 with the sample under conditions to allow 
specific binding of the antibody and the polypeptide, and 

b) separating the antibody from the sample and obtaining the purified polypeptide 
comprising an amino acid sequence selected from the group consisting of SEQ ID 
NO:1-5. 

46. A microarray wherein at least one element of the microarray is a polynucleotide of claim 
13. 



47. A method of generating an expression profile of a sample which contains 
polynucleotides, the method comprising: 

a) labeling the polynucleotides of the sample, 

b) contacting the elements of the microarray of claim 46 with the labeled 
polynucleotides of the sample under conditions suitable for the formation of a 
hybridization complex, and 

c) quantifying the expression of the polynucleotides in the sample. 


48. An array comprising different nucleotide molecules affixed in distinct physical locations 
on a solid substrate, wherein at least one of said nucleotide molecules comprises a first 
oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous 
nucleotides of a target polynucleotide, and wherein said target polynucleotide is a polynucleotide of 
claim 12. 

49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide. 

50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide. 

51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to said target polynucleotide. 

52. An array of claim 48, which is a microarray. 

53. An array of claim 48, further comprising said target polynucleotide hybridized to a 
nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence. 

54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to 
said solid substrate. 

55. An array of claim 48, wherein each distinct physical location on the substrate contains 
multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical 
location have the same sequence, and each distinct physical location on the substrate contains 
nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at 
another distinct physical location on the substrate. 

56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1. 

57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2. 

58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3. 

59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4. 

60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5. 

61. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:6. 

62. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:7. 

63. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:8. 

64. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:9. 

65. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID 
NO:10. 

ABSTRACT OF THE DISCLOSURE 

Various embodiments of the invention provide human kinases and phosphatases (KPP) and 
polynucleotides which identify and encode KPP. Embodiments of the invention also provide 
expression vectors, host cells, antibodies, agonists, and antagonists. Other embodiments provide 
methods for diagnosing, treating, or preventing disorders associated with aberrant expression of KPP. 

<120> KINASES AND PHOSPHATASES 

<130> PP-1688 P 

<140> To Be Assigned 
<141> Herewith 

<160> 10 

<170> PERL Program 

<210> 1 

<211> 1125 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 2509577CD1 

<400> 1 

Met Pro Asp Gln Asp Lys Lys Val Lys Thr Thr Glu Lys Ser Thr 
1 5 10 15 

Asp Lys Gln Gln Glu Ile Thr Ile Arg Asp Tyr Ser Asp Leu Lys 
20 25 30 

Arg Leu Arg Cys Leu Leu Asn Val Gln Ser Ser Lys Gln Gln Leu 
35 40 45 

Pro Ala Ile Asn Phe Asp Ser Ala Gln Asn Ser Met Thr Lys Ser 
50 55 60 

Glu Pro Ala Ile Arg Ala Gly Gly His Arg Ala Arg Gly Gln Trp 
65 70 75 

His Glu Ser Thr Glu Ala Val Glu Leu Glu Asn Phe Ser Ile Asn 
80 85 90 

Tyr Lys Asn Glu Arg Asn Phe Ser Lys His Pro Gln Arg Lys Leu 
95 100 105 

Phe Gln Glu Ile Phe Thr Ala Leu Val Lys Asn Arg Leu Ile Ser 
110 115 120 

Arg Glu Trp Val Asn Arg Ala Pro Ser Ile His Phe Leu Arg Val 
125 130 135 

Leu Ile Cys Leu Arg Leu Leu Met Arg Asp Pro Cys Tyr Gln Glu 
140 145 150 

Ile Leu His Ser Leu Gly Gly Ile Glu Asn Leu Ala Gln Tyr Met 
155 160 165 

Glu Ile Val Ala Asn Glu Tyr Leu Gly Tyr Gly Glu Glu Gln His 
170 175 180 

Thr Val Asp Lys Leu Val Asn Met Thr Tyr Ile Phe Gln Lys Leu 
185 190 195 

Ala Ala Val Lys Asp Gln Arg Glu Trp Val Thr Thr Ser Gly Ala 
200 205 210 

His Lys Thr Leu Val Asn Leu Leu Gly Ala Arg Asp Thr Asn Val 
215 220 225 

Leu Leu Gly Ser Leu Leu Ala Leu Ala Ser Leu Ala Glu Ser Gln 
230 235 240 

Glu Cys Arg Glu Lys Ile Ser Glu Leu Asn Ile Val Glu Asn Leu 
245 250 255 

Leu Met Ile Leu His Glu Tyr Asp Leu Leu Ser Lys Arg Leu Thr 
260 265 270 

Ala Glu Leu Leu Arg Leu Leu Cys Ala Glu Pro Gln Val Lys Glu 
275 280 285 

Gln Val Lys Leu Tyr Glu Gly Ile Pro Val Leu Leu Ser Leu Leu 
290 295 300 

His Ser Asp His Leu Lys Leu Leu Trp Ser Ile Val Trp Ile Leu 
305 310 315 

Val Gln Val Cys Glu Asp Pro Glu Thr Ser Val Glu Ile Arg Ile 
320 325 330 

Trp Gly Gly Ile Lys Gln Leu Leu His Ile Leu Gln Gly Asp Arg 
335 340 345 

Asn Phe Val Ser Asp His Ser Ser Ile Gly Ser Leu Ser Ser Ala 
350 355 360 

Asn Ala Ala Gly Arg Ile Gln Gln Leu His Leu Ser Glu Asp Leu 
365 370 375 

Ser Pro Arg Glu Ile Gln Glu Asn Thr Phe Ser Leu Gln Ala Ala 
380 385 390 

Cys Cys Ala Ala Leu Thr Glu Leu Val Leu Asn Asp Thr Asn Ala 
395 400 405 

His Gln Val Val Gln Glu Asn Gly Val Tyr Thr Ile Ala Lys Leu 
410 415 420 

Ile Leu Pro Asn Lys Gln Lys Asn Ala Ala Lys Ser Asn Leu Leu 
425 430 435 

Gln Cys Tyr Ala Phe Arg Ala Leu Arg Phe Leu Phe Ser Met Glu 
440 445 450 

Arg Asn Arg Pro Leu Phe Lys Arg Leu Phe Pro Thr Asp Leu Phe 
455 460 465 

Glu Ile Phe Ile Asp Ile Gly His Tyr Val Arg Asp Ile Ser Ala 
470 475 480 

Tyr Glu Glu Leu Val Ser Lys Leu Asn Leu Leu Val Glu Asp Glu 
485 490 495 

Leu Lys Gln Ile Ala Glu Asn Ile Glu Ser Ile Asn Gln Asn Lys 
500 505 510 

Ala Pro Leu Lys Tyr Ile Gly Asn Tyr Ala Ile Leu Asp His Leu 
515 520 525 

Gly Ser Gly Ala Phe Gly Cys Val Tyr Lys Val Arg Lys His Ser 
530 535 540 

Gly Gln Asn Leu Leu Ala Met Lys Glu Val Asn Leu His Asn Pro 
545 550 555 

Ala Phe Gly Lys Asp Lys Lys Asp Arg Asp Ser Ser Val Arg Asn 
560 565 570 

Ile Val Ser Glu Leu Thr Ile Ile Lys Glu Gln Leu Tyr His Pro 
575 580 585 

Asn Ile Val Arg Tyr Tyr Lys Thr Phe Leu Glu Asn Asp Arg Leu 
590 595 600 
^^Jle val Met Glu .Leu. lie Glu Gly Ala Pro Leu Gly Glu His 

„ 605 610 615 

Phe Ser Ser Leu Lys Glu Lys His His His Phe Thr Glu Glu Arg 
T m r 625 630 

Leu Trp Lys He Phe He Gin Leu Cyfe Leu Ala Leu Arg Tyr Leu ' 

635 640 645 

His Lys Glu Lys Arg He Val His Arg Asp Leu Thr Pro Asn Asn 

650 655 660 

lie Met Leu Gly Asp Lys Asp Lys Val Thr Val Thr Asp Phe Gly 
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665 670 675 

Leu Ala Lys Gin Lys Gin Glu Asn Ser Lys Leu Thr Ser Val Val 

680 685 690 

Gly Thr He l*eu Tyr Ser Cys Pro Glu Val Leu Lys Ser Glu Pro 

695 700 705 

Tyr Gly Glu Lys Ala Asp Val Trp Ala Val Gly Cys He Leu Tyr 

710 715 720 

Gin Met Ala Thr Leu Ser Pro Pro Phe Tyr Ser Thr Asn Met Leu 

725 730 735 

Ser Leu Ala Thr Lys He Val Glu Ala Val Tyr Glu Pro Val Pro 

740 745 750 

Glu Gly He Tyr Ser Glu Lys Val Thr Asp Thr He Ser Arg Cys 

755 760 765 

Leu Thr Pro Asp Ala Glu Ala Arg Pro Asp He Val Glu Val Ser 

770 775 780 

Ser Met He Ser Asp Val Met Met Lys Tyr Leu Asp hsn Leu Ser 

785 790 795 

Thr Ser Gin Leu Ser Leu Glu Lys Lys Leu Glu Arg Glu Arg Arg 

800 805 810 

Arg Thr Gin Arg Tyr Phe Met Glu Ala Asn Arg Asn Thr Val Thr 

815 820 825 

Cys His His Glu Leu Ala Val Leu Ser His Glu Thr Phe Glii Lys 

830 835 840 

Ala Ser Leu Ser Ser Ser Ser Ser Gly Ala Ala Ser Leu Lys Ser 

845 850 855 

Glu Leu Ser Glu Ser Ala Asp Leu Pro Pro Glu Gly Phe Gin Ala 

860 865 870 

Ser Tyr Gly Lys Asp Glu Asp Arg Ala Cys Asp Glu He Leu Ser 

875 880 885 

Asp Asp Asn Phe Asn Leu Glu Asn Ala Glu Lys Asp Tlir Tyr Ser 

890 895 900 

Glu Val Asp Asp Glu Leu Asp He Ser Asp Asn Ser Ser Ser Ser 

905 910 915 

Ser Ser Ser Pro Leu Lys Glu Ser Thr Phe Asn He Leu Lys T^g 

920 925 930 

Ser Phe Ser Ala Ser . Gly Gly Glu Arg Gin Ser Gin Thr Arg Asp 

935 940 945 

Phe Thr Gly Gly Thr Gly Ser Arg Pro Arg Pro Gly Pro Gin Met 

950 955 960 

Gly Thr Phe Leu Trp Gin Ala Ser Ala Gly He Ala Val Ser Gin 

965 970 975 

Arg Lys Val Arg Gin He Ser Asp- Pro He Gin Gin He Leu He 

980 985 990 

Gin Leu His Lys He He Tyr He Thr Gin Leu. Pro.. Pro. Ala Leu- ^ - - 

995 1000 1005 

His His Asn Leu Lys Arg Arg Val He Glu Arg Phe Lys Lys Ser 
1010 1015 1020 

Leu Phe Ser Gin Gin Ser Asn Pro Cys Asn Leu Lys Ser Glu He 

1025 1030 1035 . 

Lys Lys Leu Ser Gin Gly Ser Pro Glu Pro He Glu Pro Asn Phe 

- • -1040 1045 , .1050 

Phe Thr Ala Asp Tyr His Leu Leu His Arg Ser Ser Gly Gly Asn 
1055 1060 1065 

Ser Leu Ser Pro Asn Asp Pro Thr Gly Leu Pro Thr Ser He Glu 
^ 1070 1075 1080 

Leu Glu Glu Gly He Thr Tyr Glu Gin Met Gin Thr Val He Glu 
1085 1090 1095 

Glu Val Leu Glu Glu Ser Gly Tyr Tyr Asn Phe Thr Ser Asn Arg 
1100 1105 1110 

Tyr His Ser Tyr Pro Trp Gly Thr Lys Asn His Pro Thr Lys Arg 
1115 1120 1125 



<210> 2 

<211> 888 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7505222CD1 



<400> 2 

Met Gin lie Val Gly Ser Pro Gly Pro Gly Ala Ala Trp Pro Val 
15 10 15 

Lys Arg Val Val Phe Pro Asn Gly Glu Gin Phe Leu Leu ser Val 
20 25 30 

Ala Tlir Lys Lys Val lie Cys Leu Cys Leu Gly Lys Ala Gly Arg 
35 40 45 

Lys Val Leu Ala Lys Lys Leu Ser Pro Leu Glu Thr Met Asp Lys 
50 55 60 

Tyr Asp Vai lie Lys Ala He Gly Gin Gly Ala Phe Gly Lys Ala 
65 70 75 

Tyr Leu Ala Lys Gly Lys Ser Asp Ser Lys His Cys Val He Lys 
80 815 90- 

Glu He Asn Phe Glu Lys Met Pro He Gin Glu Lys Glu Ala Ser 
95 100 105 

Lys Lys Glu Val He Leu Leu Glu Lys Met Lys His Pro Asn He 
110 115 120 

Val Ala Phe Phe Asn Ser Phe Gin Glu Asn Gly Arg Leu Phe He 
125 130 135 

val Met Glu Tyr Cys Asp Gly Gly Asp Leu Met Lys Arg He Asn 
140 145 150 

Arg Gin Arg Gly Val Leu Phe Ser Glu Asp Gin He Leu Gly Trp 
155 160 165 

Phe Val Glti He Ser Leu Gly Leu Lys His He His Asp Arg Lys 
170 175 180 

He Leu His Arg Asp He Lys Ala Gin Asn He Phe Leu Ser Lys 
185 190 195 

Asn Gly Met Val Ala Lys Leu Gly Asp Phe Gly He Ala Arg Val 
200 205 210 

Leu Asn Asn Ser Met Glu Leu Ala Arg Thr Cys He Gly Thr Pro 
215 . . 220 - - 2-25- 

'Tyr Tyr Leu Ser Pro Glu He Cys Gin Asn Lys Pro Tyr Asn Asn 
230 235 240 

Lys Thr Asp He Trp Ser Leu Gly Cys Val Leu Tyr Glu Leu Cys 
245 250 255 

Thr Leu Lys His Pro Phe Glu Gly Asn Asn Leu Gin Gin Leu Val 
260 265 270 

Leu Lys He Cys Gin. Ala His .Phe Ala ?ro lie ^er PV9 Gly ?h.e 
275 280 285 

Ser Arg Glu Leu His Ser Leu He Ser Gin Leu Phe Gin Val Ser 
290 295 300 

Pro Arg Asp Arg Pro Ser He Asn Ser He Leu Lys Arg Pro Phe 
305 310 315 

Leu Glu Asn Leu He Pro Lys Tyr Leu Thr Pro Glu Val He Gin 
320 325 330 

Glu Glu Phe Ser His Met Leu He Cys Arg Ala Gly Ala Pro Ala 
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335 340 345 

Ser Arg Hie Ala Gly Lys Val Val Gin Lys Cys Lys lie G,ln Lys 

350 355 360 

Val Arg Phe Gin Gly Lys Cys Pro Pro Arg Ser Arg lie Ser Val 

365 370 375 

Pro lie Lys Arg Asn Ala Xle Leu His Arg Asn Glu Trp Arg Pro 

380 385 390 

Pro Ala Gly Ala Gin Lys Ala Arg Ser Xle Lys Met Xle Glu Arg 

395 400 405 

Pro Lys lie Ala Ala Val Cys Gly His Tyr Asp Tyr Tyr Tyr Ala 

410 415 420 

Gin Leu Asp Met Leu Arg Arg Arg Ala His Lys Pro Ser Tyr His 

425 430 435 

Pro lie Pro Gin Glu Asn Thr Gly Val Glu Asp Tyr Gly Gin Glu 

440 445 450 

Thr Arg His Gly Pro Ser Pro Ser Gin Trp Pro Ala Glu Tyr Leu 

455 460 465 

Gin Arg Lys Phe Glu Ala Gin Gin Tyr Lys Leu Lys Val Glu Lys 

470 475 480 

Gin Leu Gly Leu Arg Pro Ser Ser Ala Glu Pro Asn Ty^^ Asn Gin 

485 490 • 495 

Arg Gin Glu Leu Arg Ser Asn Gly Glu Glu Pro Arg Phe Gin Glu 

500 505 510 

Leu Pro Phe Arg Lys Asn Glu Met Lys Glu Gin Glu Tyr Trp Lys 

515 520 525 

Gin Leu Glu Glu Xle Arg Gin Gin Tyr His Asn Asp Met Lys Glu 

530 535 540 

Xle Arg Lys Lys Met Gly Arg Glu Pro Glu Glu Asn Ser Lys Xle 

545 550 555 

Ser His Lys Thr Tyr Leu Val Lys Lys Ser Asn Leu Pro Val His 

560 565 570 

Gin Asp Ala Ser Glu Gly Glu Ala Pro Val Gin Asp Xle Glu Lys 

575 580 585 

Asp Leu Lys Gin Met Arg Leu Gin Asn Thr Lys Glu Ser Lys Asn 

590 595 600 

Pro Glu Gin Lys Tyr Lys Ala Lys Gly Val Lys Phe Glu lie Asn 

605 610 615 

Leu Asp Lys Cys Xle Ser Asp Glu Asn Xle Leu Gin Glu Glu Glu 

620 625 630 

Ala Met Asp Xle Pro Asn Glu Thr Leu Thr Phe Glu Asp Gly Met 

635 640 645 

Lys Phe Lys Glu Tyr Glu Cys Val Lys Glu His Gly Asp Tyr Thr 

650 655 660 

Asp Lys Ala Phe Glu Lys Leu His Cys .Pro GXu.Ala.-Gly Phe- Ser 

665 670 675 

Thr Qln Thr Val Ala Ala Val Gly Asn Arg Arg Gin Trp Asp Gly 

680 685 690 

Gly Ala Pro Gin .Thr Leu Leu Gin Met .Met Ala Val Ala Asp He 

695 700 705 

Thr Ser Thr Cys Pro Thr Gly Pro Asp Asn Gly Gin Val lie Val 

710 . 715 . 720 

Xle Glu Gly Xle Pro Gly Asn Arg Lys Gin Trp Arg His Glu Ala 

725 730 735 

Pro Gly Thr Leu Met Ser Val Leu Ala Ala Ala His Leu Thr Ser 

740 74'5 750 

Ser Ser Phe Ser Ala Asp Glu Glu Phe Ala Met Gly Thr Leu Lys 

755 760 765 

Gin Trp Leu Pro Lys Glu Glu Asp Glu Gly Lys Val Glu Met Val 

770 775 780 
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Ser Gly He Glu Val Asp Glu Glu Gin Leu Glu Pro Arg Ser Asp 

785 790 795 

Asp Asp Asp Thr Asn Phe Glu Glu Ser Glu Asp Glu Leu Arg Asp 

800 805 810 

.Glu Val Val Glu Tyr Leu Glu Lys Leu Ala Thr Phe Lys Gly Glu 

815 820 825 

Glu Lys Thr Glu Glu Ala Ser Ser Thr Ser Lys Asp Ser Arg Lys 

830 835 840 

Ser Arg Glu Arg Glu Gly He Ser Met Gin Lys Ser Glu Glu Leu 

845 850 855 

Arg Glu Gly Leu Glu Asn He Ser Thr Thr Ser Asn Asp His He 

860 865 870 

Cys He Thr Asp Glu Asp Gin Gly Thr Ser Thr Thr Ser Gin Asn 

875 880 885 

Ile Gln Val 



<210> 3 

<211> 487 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7524408CD1 

<400> 3 

Met Gly Arg He Gly He Ser Cys Leu Phe Pro Ala Ser Trp His 
15 10 15 

Phe Ser He Ser Pro Val Gly Cys Pro Arg He Leu Asn Thr Asn 
20 25 30 

Leu Arg Gin He Met Val He Ser Val Leu Ala- Ala Ala Val Ser 

35 - 40 45 * 

Leu Leu Tyr Phe Ser Val Val He He Arg Asn Lys Tyr Gly Arg 
50 55 60 

Leu Thr Arg Asp Lys Lys Phe Gin Arg Tyr Leu Ala Arg Val Thr 
65 70 75 

Asp He Glu Ala Thr Asp Thr Asn Asn Pro Asn Val Asn Tyr Gly 
80 85 90 

He Val Val Asp Cys Gly Ser Ser Gly Ser Arg Val Phe Val Tyr 
95 100 105 

Cys Trp Pro Arg His Asn Gly Asn Pro His Asp Leu Leu Asp He 
110 115 120 

Arg Gin Met Arg Asp Lys Asn Arg. Lys Pro. Val.. Val Met Lys He ^--rrr-. - 

125 130 135 

Lys Pro Gly He Ser Glu Phe Ala Thr Ser Pro Glu Lys Val Ser 
140 '145 150 

Asp Tyr He Ser Pro Leu Leu Asn Phe Ala Ala Glu His Val Pro 
155 160 165 

Arg Ala Lys His Lys Glu Thr Pro Leu Tyr He Leu Cys Thr Ala 

170 175 180 

Gly Met Arg He Leu Pro Glu Ser Gin Gin Lys Ala He Leu Glu 
185 190 195 

Asp Leu Leu Thr Asp He Pro Val His Phe Asp Phe Leu Phe Ser 
200 205 210 

Asp Ser His Ala Glu Val He Ser Gly Lys Gin Glu Gly Val Tyr 
215 220 225 

Ala Trp He Gly He Asn Phe Val Leu Gly Arg Phe Glu His He 
230 235 240 
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61u Asp Asp Asp Glu Ala Val Val Glu Val Asn lie Pro Gly Ser 
245 250 255 

val Ser ser Glu Ala lie Val Arg Lys Arg Thr Ala Gly lie Leu 
260 265 270 

Asp Met Gly Gly Val Leu Thr Gin lie Ala Tyr Glu Val Pro Lys 
_ , 280 285 

Thr Ala Ser Phe Ala Ser Ser Gin Gin Glu Glu Val Ala Lys Asn 
290 295 300 

Leu Leu Ala Glu Phe Asn Leu Gly Cys Asp Val His Gin Thr Glu 

305 310 
Has Val Tyr Arg Val Tyr Val Ala Thr Phe Phe Gly Phe Gly Gly 
320 325 330 

Asn Ala Ala Arg Gin Arg Tyr Glu Asp Arg lie Phe Ala Asn Thr 
335 340 345 

He Gin Lys Asn Arg Leu Leu Gly Lys Gin Thr Gly Leu Thr Pro 

350 355 
Asp Met Pro Tyr Leu Asp Pro Cys Leu Pro Leu Asp He Lys Asp 

I 365 370 

Glu He Gin Gin Asn Gly Gin Thr He Tyr Leu Arg Gly Thr Gly 
380 385 390 

Asp Phe Asp Leu Cys Arg Glu Thr He Gin Pro Phe Met Asn Lys 
395 400 405 

Thr Asn Glu Thr Gin Thr Ser Leu Asn Gly Val l^r Gin Pro Pro 
■ . 415 420 

He Hxs Phe Gin Asn Ser Glu Mra Tyr Gly Phe Ser Glu Phe Tyr 
_ 425 430 435 

Tyr cys Thr Glu Asp Val Leu Arg Met Gly Gly Asp Tyr Asn Ala 

440 445 
Ala Lys Phe Thr Lys Ala Ala Lys Asp Tyr Cys Ala Thr Lys Trp 
455 460 465 

Ser He Leu Arg Glu Arg Phe Asp Arg Gly Leu Tyr Ala Ser His 

475 480 

Ala Asp Leu His Arg Leu Lys 
485 

<210> 4 

<211> 1309 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7526163CD1 

<400> 4 

Mtet Asp Glu Ser Ser Leu Leu Arg Arg Arg Gly Leu Gin Lys Glu 

T « ' ^ 10 15 

Leu Ser Leu Pro Arg Arg Gly Arg Gly Cys Arg Ser Gly Asn Arg 

20 25 30 

Lys Ser Leu Val Val Gly Thr Pro Ser Pro Thr Leu Ser Arg Pro 

_ - . 3S 40 ... 45 

Leu Ser Pro Leu Ser Val Pro Thr Ala .Gly Ser Ser Pro Leu Asp 

50 55 go 

Ser Pro Arg Asn Phe Ser Ala Ala Ser Ala Leu Asn Phe Pro Phe 

65 70 75 

Ala Arg Arg Ala Asp Gly Arg Arg Trp Ser Leu Ala Ser Leu Pro 

80 85 90 

ser Ser Gly Tyr Gly Thr Asn Thr Pro Ser Ser Thr Leu Ser Ser 

95 100 105 
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Ser Ser Ser Ser Arg Glu Arg Leu His Gin Leu Pro Phe Gin Pro 
"0 115 ' 120 

Thr Pro Asp Glu Leu His Phe Leu Ser Lys His Phe Arg Ser Ser 
125 130 135 

Glu Asn Val Leu Asp Glu Glu Gly Gly Arg Ser Pro Arg Leu Arg 
140 145 ISO 

Pro Arg Ser Arg Ser Leu Ser Pro Gly Arg Ala Thr Gly Thr Phe 
155 160 165 

Asp Asn Glu lie Val Met Met Asn His Val Tyr Arg Glu Arg Phe 
"0 175 180 

Pro Lys Ala Thr Ala Gin Met Glu Gly Arg Leu Gin Glu Phe Leu 
185 190 195 

Thr Ala Tyr Ala Pro Gly Ala Arg Leu Ala Leu Ala Asp Gly Val 
200 205 210 

Leu Gly Phe He His His Gin He Val Glu Leu Ala Arg Asp Cys 
215 220 225 

Leu Ala Lys Ser Gly Glu Asn Leu Val Thr Ser Arg Tyr Phe Leu 
230 235 240 

Glu Met Gin Glu Lys Leu Glu Arg Leu Leu Gin Asp Ala His Glu 
245 250 255 

Arg Ser Asp Ser Glu Glu Val Ser Phe He Val Gin Leu Val Arg 
260 265 270 

Lys Leu Leu He He He Ser Arg Pro Ala Arg Leu Leu Glu Cys 
275 280 285 

Leu Glu Phe Asp Pro Glu Glu Phe Tyr His Leu Leu Glu Ala Ala 
290 295 300 

Glu Gly His Ala Arg Glu Gly Gin Gly He Lys Thr Asp Leu Pro 
305 310 315 

Gin Tyr He He Gly Gin Leu Gly Leu Ala Lys Asp Pro Leu Glu 
320 325 330 

Glu Met val Pro Leu Ser His Leu Glu Glu Glu Gin Pro Pro Ala 
335 340 345 

Pro Glu Ser Pro Glu Ser Arg Ala Leu Val Gly Gin. Ser Arg Arg 

350 355 
Lys Pro cys Glu Ser A^p Phe Glu Thr He Lys Leu He Ser Asn 
365 370 375 

Gly Ala Tyr Gly Ala Val Tyr Leu Val Arg His Arg Asp Thr Arg 
380 385 390 

Gin Arg Phe Ala He Lys Lys He Asn Lys Gin Asn Leu He Leu 
395 400 405 

Arg Asn Gin Val Gin Gin Val Phe Val Glu Arg Asp He Leu Thr 
• 410 415 420 

Phe Ala Glu Asn Pro Phe Val Val Ser Met Phe Cys Ser Phe Glu 

- 425 430 . . - .4-35 

Thr Arg Arg His Leu Cys Met Val Met Glu Tyr Val Glu Gly Gly 
440* 445 450 

Asp Cys Ala Thr Leu Leu Lys Asn Met Gly Pro Leu Pro Val Asp 
455 460 465 

Met Ala Arg Leu Tyr Phe Ala Glu Thr Val Leu Ala Leu Glu Tyr 
470 475 480 

Leu His. Asn Tyr Qly He Vai His .Arg Asp Leu.I^r^ Prp A5P Asn 

485 490 495 ' " 

Leu Leu He Thr Ser Leu Gly His He Lys Leu Thr Asp Phe Gly 
500 505 510 

Leu Ser Lys He Gly Leu Met Ser Met Ala Thr Asn Leu Tyr Glu 
515 520 525 

Gly His He Glu Lys Asp Ala Arg Glu Phe He Asp Lys Gin Val 
530 535 540 

Cys Gly Thr Pro Glu Tyr He Ala Pro Glu Val He Phe Arg Gin 
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545 550 
Gly Tyr Gly bys Pro Val Aap Trp Trp Ala Met Gly Val Val Leu 

560 565 
Tyr Glu Phe Leu Val Gly Cys Val Pro Phe Phe Gly. Asp Thr Pro 
575 580 595 

Glu Glu Leu Phe Gly Gin Val Val Ser Asp Glu lie Met Trp Pro 
^, 59S 600 

Glu Gly Asp Glu Ala Leu Pro Ala Asp Ala Gin Asp Leu lie Thr 
605 610 615 

Arg Leu Leu Arg Gin Ser Pro Leu Asp Arg Leu Gly Thr Gly Gly 
620 , 625 630 

Thr His Glu Val Lys Gin His Pro Phe Phe Leu Ala Leu Asp Trp 
635 640 645 

Ala Gly Leu Leu Arg His Lys Ala Glu Phe Val Pro Gin Leu Glu 
650 655 660 

Ala Glu Asp Asp Thr Ser Tyr Phe Asp Thr Arg Ser Glu Arg Tyr 
665 670 675 

Arg Hxs Leu Gly Ser Glu Asp Asp Glu Thr Asn Asp Glu Glu Ser 
680 685 690 

Ser Thr Glu lie Pro Gin Phe Ser Ser Cys Ser His Arg Phe Ser 
695 700 705 

Lys val Tyr Ser Ser Ser Glu Phe Leu Ala Val Gin Pro Thr Pro 
"^^^ 715 720 

Thr Phe Ala Glu Arg Ser Phe Ser Glu Asp Arg Glu Glu Gly Trp 
725 730 735 

Glu Arg Ser Glu Val Asp Tyr Gly Arg Arg Leu Ser Ala Asp lie 
■^40 745 750 

Arg Leu Arg Ser Trp Thr Ser Ser Gly ser Ser Cys Gin Ser Ser 
755 760 . 765 

Ser Ser Gin Pro Glu Arg Gly Pro Ser Pro Ser Leu Leu Asn Thr 
^, „ • ■'■'0 775 . 780 

lie Ser. Leu Asp Thr Met Pro Lys Phe Ala Phe Ser Ser Glu Asp 
785 790 795 

Glu Gly Val Gly Pro Gly Pro Ala Gly Pro Lys Arg Pro 'Val Phe 
800 805 810 

He Leu Gly Glu Pro Asp Pro Pro Pro Ala Ala Thr Pro Val Mtet 
315 820 825 

Pro Lys Pro Ser Ser Leu Ser Ala Asp Thr Ala Ala Leu Ser His 

»i , "° 835 840 

Ala Arg Leu Arg Ser Asn Ser He Gly Ala Arg His Ser Thr Pro 

845 850 855 

Arg Pro Leu Asp Ala Gly Arg Gly Arg Arg Leu Gly Gly Pro Arg 

860 865 870 

Asp Pro Ala Pro Glu Lys Ser Arg Ala Ser Ser Ser Gly Gly-Ser..- . 

875 880 885 

Gly Gly Gly Ser Gly Gly Arg Val Pro Lys Ser Ala Ser Val Ser 
r 895 900 

Ala Leu Ser Leu He He Thr Ala Asp Asp Gly Ser Gly Gly Pro 

905 910 915 

Leu Met Ser Pro Leu Ser Pro Arg Ser Leu Ser Ser Asn Pro Ser 

„ V -^25. 930 

ser Arg Asp Ser Ser Pro Ser Arg Asp Pro Ser Pro Val Cys Gly" 

935 940 945 

Ser Leu Arg Pro Pro lie Val He His Ser Ser Gly Lys Lys Tyr 

r-i «u 555 960 

Gly Phe Ser Leu Arg Ala He Arg Val Tyr Met Gly Asp Ser Asp 

965 970 975 

Val Tyr Thr Val His His Val Val Trp Ser Val Glu Asp Gly Ser 

980 985 990 
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Pro Ala Gin Glu Ala Gly Leu Arg Ala Gly Asp Leu lie Thr His 
995 1000 1005 

lie Asn Gly Glu Ser Val Leu Gly Leu Val His Met Asp Val Val 

1010 1015 1020 

Glu Leu Leu Leu Lys Ser Gly Asn Lys He Ser Leu Arg Thr Thr 

1025 1030 1035 

Ala Leu Glu Asn Thr Ser He Lys Val Gly Pro Ala Arg Lys Asn 

1040 1045 1050 

Val Ala Lys Gly Arg Met Ala Arg Arg Ser Lys Arg Ser Arg Arg 

1055 1060 1065 

Arg Glu Thr Gin Asp Arg Arg Lys Ser Leu Phe Lys Lys He Ser 

1070 1075 1080 

Lys Gin Thr Ser Val Leu His Thr Ser Arg Ser Phe Ser Ser Gly 

1085 1090 1095 

Leu His His Ser Leu Ser Ser Ser Glu Ser Leu Pro Gly Ser Pro 

1100 1105 1110 

Thr His Ser Leu Ser Pro Ser' Pro Thr Thr Pro Cys Arg Ser Pro 

1115 1120 1125 

Ala Pro Asp Val Pro Ala Asp Thr Thr Ala Ser Pro Pro Ser Ala 

1130 1135 1140 

Ser Pro Ser Ser Ser Ser Pro Ala Ser Pro Ala Ala Ala Gly His 

1145 1150 1155 

Thr Arg Pro Ser Ser Leu His Gly Leu Ala Ala Lys Leu Gly Pro 

1160 1165 1170 

Pro Arg Pro Lys Thr Gly Arg Arg Lys Ser Thr Ser Ser He Pro 

1175 1180 1185 

Pro Ser Pro Leu Ala Cys Pro Pro He Ser Ala Pro Pro Pro Arg 

1190. 1195 1200 

Ser Pro Ser Pro Leu Pro Gly His Pro Pro Ala Pro Ala Arg Ser 

1205 1210 1215 

Pro Arg Leu Arg Arg Gly Gin Ser Ala Asp Lys Leu Gly Thr Gly 

1220 . 1225 1230 

Glu Arg Leu Asp Gly Glu Ala Gly Arg Arg Thr Arg Gly Pro Glu 

1235 1240 1245 

Ala Glu Leu Val Val Met Arg Arg Leu His Leu Ser Glu Arg Arg 

1250 1255 1260 

Asp Ser Phe Lys Lys Gin Glu Ala Val Gin Glu Val Ser Phe Asp 

1265 1270 1275 

Glu Pro Gin Glu Glu Ala Thr Gly Leu Pro Thr Ser Val Pro Gin 

1280 1285 ^ 1290 

He Ala Val Glu Gly Glu Glu Ala Val Pro Val Ala Leu Gly Pro 

1295 1300 1305 

Thr Gly Arg Asp 



<210> 5 

<211> 1331 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7526158CD1 

<400> 5 

Met Lys Ser Arg Arg Asp Lys Leu His lie Pro Ala Leu Thr Leu 
15 10 15 

Asp Leu Ser Pro Ser Ser Gin Ser Pro Ser Leu Leu Gly Pro Ser 
20 25 30 
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Ser Pro Cys Ser Pro Cys Ser Pro- Ser Leu Gly Leu His Pro Trp 
35 40 45 

Ser Cys Arg Ser Gly Asn Arg Lys Ser Leu Val Val Gly Thr Pro 
50 55 60 

Ser Pro Thr Leu Ser Arg Pro Leu Ser Pro Leu Ser Val Pro Thr 
65 70 75 

Ala Gly Ser Ser Pro Leu Asp Ser Pro Arg Asn Phe Ser Ala Ala 
80 85 90 

Sfer Ala Leu Asn Phe Pro Phe Ala Arg Arg Ala Asp Gly Arg Arg 
95 100 105 

Trp Ser Leu Ala Ser Leu Pro Ser Ser Gly Tyr Gly Thr Asn Thr 

110 115 120 

Pro Ser Ser , Thr Leu Ser Ser Ser Ser Ser Ser Arg Glu Arg Leu 
/ 125 130 135 

His Gin Leu Pro Phe Gin Pro Thr Pro Asp Glu Leu His Phe Leu 

140 145 150 

Ser Lys His Phe Arg Ser Ser Glu Asn Val Leu Asp Glu Glu Gly • 

155 160 165 ' 

Gly Arg Ser Pro Arg Leu Arg Pro Arg Ser Arg Ser Leu Ser Pro 

170 175 180 

Gly Arg Ala Thr Gly Thr Phe Asp Asn Glu lie Val Met Met Asn 

185 190 195 

His Val Tyr Arg Glu Arg Phe Pro Lys Ala Thr Ala Gin Met Glu 

200 205 210 

Gly Arg Leu Gin Glu Phe Leu Thr Ala Tyr Ala Pro Gly Ala Arg 

215 220 225 

Leu Ala Leu Ala Asp Gly Val Leu Gly Phe He His His Gin He 

230 235 240 

Val Glu Leu Ala Arg Asp Cys Leu Ala Lys Ser Gly Glu Asn Leu 

245 250 255 

Val Thr Ser Arg Tyr Phe Leu Glu Met Gin Glu Lys Leu Glu Arg 

260 265 270 

Leu Leu Gin Asp Ala His Glu Arg Ser Asp Ser Glu Glu Val Ser 

275 280 285 

Phe He Val Gin Leu Val Arg Lys Leu Leu He He He Ser Arg 

290 295 300 

Pro Ala Arg Leu Leu Glu Cys Leu Glu Phe Asp Pro Glu Glu Phe 

305 310 315 

Tyr Hi3 Leu Leu Glu Ala Ala Glu Gly His Ala Arg Glu Gly Gin 

320 325 330 

Gly He Lys Thr Asp Leu Pro Gin Ty?: He He Gly Gin Leu Gly 

335 340 345 

Leu Ala Lys Asp Pro Leu Glu Glu Met Val Pro Leu Ser His Leu 

350 . ->355- - - , ^ . 360. — - . ► - 

Glu Glu Glu Gin Pro Pro Ala Pro Glu Ser Pro Glu Ser Arg Ala 

365 370 375 

Leu Val Gly Gin Ser Arg Arg Lys Pro Cys Glu S.er Asp Phe Glu 

380 385 . 390 

Thr He Lys Leu He Ser Asn Gly Ala Tyr Gly Ala Val Tyr Leu 

395 400 405 

Val Arg His. Arg Asp Thr Arg Gin Arg.pjj^ ftj^ He Lys Lys He 

410 415 420 ' ....... 

Asn Lys Gin Asn Leu He Leu Arg Asn Gin He Gin Gin Val Phe 

425 430 435 

Val Glu Arg Asp He Leu Thr Phe Ala Glu Asn Pro Phe Val Val 

440 445 450 

Ser Met Phe Cys Ser Phe Glu Thr Arg Arg His Leu Cys Met Val 

455 460 465 

Met Glu Tyr Val Glu Gly Gly Asp Cys Ala Thr Leu Leu Lys Asn 
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470 475 
Met Gly Pro Leu Pro Val Asp Met Ala Arg Leu Tyr Phe Ala Glu 
485 490 495 

Thr Val Leu Ala Leu Glu Tyr Leu His Asn Tyr Gly lie Val His 

505 510 
Arg Asp Leu Lys Pro Asp Asn Leu Leu lie Thr Ser Leu Gly His 
515 520 525 

He Lys Leu Thr Asp Phe Gly Leu Ser Lys He Gly Leu Met Ser 
530 535 540 

Met Ala Thr Asn Leu Tyr Glu Gly His lie Glu Lys Asp Ala Arg 

545 550 
Glu Phe He Asp Lys Gin Val Cys Gly Thr Pro Glu Tyr He Ala 
560 565 570 

Pro Glu val lie Phe Arg Gin Gly Tyr Gly Lys Pro Val Asp Trp 
575 580 585 

Trp Ala Met Gly Val Val Leu Tyr Glu Phe Leu Val Gly Cys Val 
590 595 600 

Pro Phe Phe Gly Asp Thr Pro Glu Glu Leu Phe Gly Gin Val Val 
605 610 63^5 

Ser Asp Glu He Met Trp Pro Glu Gly Asp Glu Ala Leu Pro Ala 
^20 625 630 

Asp Ala Gin Asp Leu lie Thr Arg Leu Leu Arg Gin Ser Pro Leu 
635 640 645 

Asp Arg Leu Gly Thr Gly Gly Thr His Glu Val Lys Gin His Pro 
«. «55 660 

Phe Phe Leu Ala Leu Asp Trp Ala Gly Leu Leu Arg His Lys Ala 
665 670 675 

Glu Phe Val Pro Gln Leu Glu Ala Glu Asp Asp Thr Ser Tyr Phe
680 685 690

Asp Thr Arg Ser Glu Arg Tyr Arg His Leu Gly Ser Glu Asp Asp
695 700 705

Glu Thr Asn Asp Glu Glu Ser Ser Thr Glu Ile Pro Gln Phe Ser

710 715 720
Ser Cys Ser His Arg Phe Ser Lys Val Tyr Ser Ser Ser Glu Phe

725 730 735

Leu Ala Val Gln Pro Thr Pro Thr Phe Ala Glu Arg Ser Phe Ser
740 745 750

Glu Asp Arg Glu Glu Gly Trp Glu Arg Ser Glu Val Asp Tyr Gly
755 760 765

Arg Arg Leu Ser Ala Asp Ile Arg Leu Arg Ser Trp Thr Ser Ser

770 775 780
Gly Ser Ser Cys Gln Ser Ser Ser Ser Gln Pro Glu Arg Gly Pro
785 790 795

Ser Pro Ser Leu Leu Asn Thr Ile Ser Leu Asp Thr Met Pro Lys

800 805 810

Phe Ala Phe Ser Ser Glu Asp Glu Gly Val Gly Pro Gly Pro Ala
815 820 825

Gly Pro Lys Arg Pro Val Phe Ile Leu Gly Glu Pro Asp Pro Pro
830 835 840

Pro Ala Ala Thr Pro Val Met Pro Lys Pro Ser Ser Leu Ser Ala
845 850 855

Asp Thr Ala Ala Leu Ser His Ala Arg Leu Arg Ser Asn Ser Ile
860 865 870

Gly Ala Arg His Ser Thr Pro Arg Pro Leu Asp Ala Gly Arg Gly
875 880 885

Arg Arg Leu Gly Gly Pro Arg Asp Pro Ala Pro Glu Lys Ser Arg
890 895 900

Ala Ser Ser Ser Gly Gly Ser Gly Gly Gly Ser Gly Gly Arg Val
905 910 915 






Pro Lys Ser Ala Ser Val Ser Ala Leu Ser Leu Ile Ile Thr Ala

920 925 930

Asp Asp Gly Ser Gly Gly Pro Leu Met Ser Pro Leu Ser Pro Arg

935 940 945

Ser Leu Ser Ser Asn Pro Ser Ser Arg Asp Ser Ser Pro Ser Arg

950 955 960

Asp Pro Ser Pro Val Cys Gly Ser Leu Arg Pro Pro Ile Val Ile

965 970 975

His Ser Ser Gly Lys Lys Tyr Gly Phe Ser Leu Arg Ala Ile Arg

980 985 990

Val Tyr Met Gly Asp Ser Asp Val Tyr Thr Val His His Val Val
995 1000 1005

Trp Ser Val Glu Asp Gly Ser Pro Ala Gln Glu Ala Gly Leu Arg

1010 1015 1020

Ala Gly Asp Leu Ile Thr His Ile Asn Gly Glu Ser Val Leu Gly

1025 1030 1035

Leu Val His Met Asp Val Val Glu Leu Leu Leu Lys Ser Gly Asn

1040 1045 1050

Lys Ile Ser Leu Arg Thr Thr Ala Leu Glu Asn Thr Ser Ile Lys

1055 1060 1065

Val Gly Pro Ala Arg Lys Asn Val Ala Lys Gly Arg Met Ala Arg

1070 1075 1080

Arg Ser Lys Arg Ser Arg Arg Arg Glu Thr Gln Asp Arg Arg Lys

1085 1090 1095

Ser Leu Phe Lys Lys Ile Ser Lys Gln Thr Ser Val Leu His Thr

1100 1105 1110

Ser Arg Ser Phe Ser Ser Gly Leu His His Ser Leu Ser Ser Ser

1115 1120 1125

Glu Ser Leu Pro Gly Ser Pro Thr His Ser Leu Ser Pro Ser Pro 

1130 1135 1140 

Thr Thr Pro Cys Arg Ser Pro Ala Pro Asp Val Pro Ala Asp Thr 

1145 1150 1155 

Thr Ala Ser Pro Pro Ser Ala Ser Pro Ser Ser Ser Ser Pro Ala 

1160 1165 1170 

Ser Pro Ala Ala Ala Gly His Thr Arg Pro Ser Ser Leu His Gly 

1175 1180 1185 

Leu Ala Ala Lys Leu Gly Pro Pro Arg Pro Lys Thr Gly Arg Arg 

1190 1195 1200 

Lys Ser Thr Ser Ser Ile Pro Pro Ser Pro Leu Ala Cys Pro Pro

1205 1210 1215

Ile Ser Ala Pro Pro Pro Arg Ser Pro Ser Pro Leu Pro Gly His

1220 1225 1230

Pro Pro Ala Pro Ala Arg Ser Pro Arg Leu Arg Arg Gly Gln Ser

1235 1240 1245

Ala Asp Lys Leu Gly Thr Gly Glu Arg Leu Asp Gly Glu Ala Gly

1250 1255 1260

Arg Arg Thr Arg Gly Pro Glu Ala Glu Leu Val Val Met Arg Arg

1265 1270 1275

Leu His Leu Ser Glu Arg Arg Asp Ser Phe Lys Lys Gln Glu Ala

1280 1285 1290

Val Gln Glu Val Ser Phe Asp Glu Pro Gln Glu Glu Ala Thr Gly

1295 1300 1305

Leu Pro Thr Ser Val Pro Gln Ile Ala Val Glu Gly Glu Glu Ala

1310 1315 1320 

Val Pro Val Ala Leu Gly Pro Thr Gly Arg Asp 

1325 1330 

<210> 6 
<211> 3912 
<212> DNA

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 2509577CB1

<400> 6 

C99cattggc ctggccagcff ctggttggtg gcctggagga ggatttgatt 60 
laSJ?^!^ cggtgcagct ggccgcggtg tccctgagaa aagtcagaaa tggccaatga 120 
^^"^^^^^^^ atgcgatgtt aattctgggg cattgatgtt ttacaatgcc 180 
tgatcaagat aaaaaggtga agaccacaga aaaatcaact gataaacagc aagaaatcac 240 
acaacaS;^ *^^^'^--sratc ttaaaagact tcggtgcctt ttgaacgtcc aaLaagcaa 300 
acaacagctt ccagccatta acttcgatag tgcccaaaat agcatgacga agtctgagcc 360 
cgccatcagg gcgggtggac acagagctcg gggtcagtgg catgaatcca cagaagctgt 420 
lo^l^^ -ttttagta taaactacaa gaa.gagaga aatttcagca aaca^ccSa tso 
!^f««^^^ tttcaggaga tctttaccgc cttggtgaaa aatagactca taagcagaga 540 
JaSagg::^ lllllllT tctgagagtg ttaatdtgtc tgaggcLct 600 

t«gtaSta oaaa?^^ aggaaatact ccatagcttg ggtgggattg aaaacctagc 660 
allllt^ll gagattgtag coaatgagta cctcggctat ggagaagagc agcacactgt 720 
aJaataaoS f'^'^^^^^^ ^^'^^^^"^ tcaaaaactt gctgcagtca aagatcaaag 780 
ll^tlm. *f*=f*=^^^*^sr gagcccacaa gacattagta aatttacttg gtgcccgaga 840 
tactaatgtt ctattgggtt cccttctggc tctggctagt ttagcagaaa gtcaagLtg 900 
lllllllllt ^^^^^^^^^^ tcaacattgt agaaaatctg ttgatgattt tacahgaata 960 
S?S^f°!! '°^*f««arac taacagcgga gttgctgcgc ctactttgtg cagagcccca 1020 

tctatgaggg gataccggtc ctcctcagtc tgctccactc 1080 
tgaccacttg aagctcctct ggagcattgt ctggattctg gtacaggttt gtgaggaccc 1140 
aaaaaac"^" ^tggaaattc gcatttgggg aggcatcaaa cagctSttc JtatJKaca ^2^2 
llllllltT. ""^S"*^ ctgatcactc ctccattgga agcctgtcca gtgcaaatgc 1260 
■ ilatacttS lllZltT tstcatttatc agaagacttg agccctaggg aaatacaaga 1320 
^Itm tcacttcaag cagcctgctg tgctgccctc actgagctgg tgctcaatga 1380 
m^tl^ caccaggtgg ttcaggaaaa tggtgtatat acaatagcaa aattaatttt 1440 
cttgaoatS ctchtJ^o^^ cagcaaaaag taatctatta cagtgttatg ctttcagago 1500 
a««o^^«^^^ ctcttcagta tggaaagaaa cagaccactc tttaaaagac ttttcccoac 1560 
aaaatS^J^ S^gatcttca ttgacatagg gcattatgta cgtgatatca gtgcttatga 1620 
agaattggta tccaagctga atttattagt ggaggatgaa ctgaagcaaa ttgctgaaaa 1680 
^a^f^^^f^ attaatcaga acaaagctcc tttgaaatat ataggcaact atgcaatttt 1740 
aaatctttta IHT^^^^ cttttggctg tgtttacaag gttagaaagc atagtggtca 1800 
agatcaaaan llttTt^^ aggtcaattt acataaccca gcatttggaa aggataagaa 1860 
agcagcgtaa ggaahattgt ttctgaatta acaataatta aagagcagct 1920 
a«J^!f aacattgtac gttattacaa aacatttctg gaaaacgata ggttgtacat 1980 
acatcaJca? lllttt^^^ sagccccgct tggagagcat ttcagttctt tgaaggaaaa 2040 

aaagactatg gaaaatattt atacagctgt gcttagctct 2100 

tcgatactta caoaaggaga agaggattgt ccatagagat ctgacaccaa acaacattat 2160

alacaaSaa tt^^'^^t^ taacagttac tgactttggc ctggcaaagc aaaaacaaga 2220 
!a!^™i ctcacgtctg tggttggaac aatcctgtat tcttgccccg aggtactgaa 2280 
f^oif^fn!^ tatggggaga aggctgatgt ctgggcagta ggctgcatcc tttatcagat 2340 
lllllT agtcccccct tctacagcac taacatgctg tccttggcta caaaaatagt 2400 
llllimlt ''f S^^'^^^S tgccagaagg tatctactct gaaaaagtaa cagacaccat 2460 ' 
gataSf™^ !^''^ atgcggaagc tcgtccagat attgtagaag tcagttcgat 2520 

fa;«!r^^ gtcatgatga. aatatttaga caacttatct acatcccagt tgtccttgga 2580 
ItltltJ-lT gaagacgoac acaaaggtat "tbtatggaag ccaaccggaa 2640 

tttaaataac agctggctgt tctatctcac gagacctttg agaaggSag 2700 

t ^5°«9=^9tg gagcagccag cctgaaaagt gaactttcag aaagcgcaga 2760 
aatcctSci lltiT'"'^. aggcctccta tggtaaagac gaagacaggg cctgtgacja 2820 
aaat™?™! S^tgataact tcaacotgga aaatgctgag aaagatacat attcagaggt 2880 
aStfJSn cggataaotc cagcagctcc agttcaagcc otctgaaaga 2940 

aacaa««~^ aacattttaa agagaagttt tagtgcttca ggaggagaaa gacaatccca 3000 
aacaagggac ttcactggag gaacaggatc aagaccaaga ccagggccac agatgggcac 3060 






attcttgtgg caagcatcag caggaattgc tgtgtcccag aggaaagtgc gtcagatcag 3120 
tgatcctatt cagcagatat taattcagct gcacaaaata atctatatca cacagcttcc 3180 
tccagctttg caccacaatt tgaaaagaag ggttatagag agattcaaga aatccctctt 3240 
cagccagcag agtaaccctt gtaatttgaa atctgaaatt aaaaagttat ctcagggatc 3300 
tccagaaccg attgagccca actttttcac agcagattac catttattac atcgttcatc 3360 
cggtggaaac agcctgtccc caaatgaccc tacaggttta ccaaccagca ttgaattgga 3420 
ggaaggaata acatatgaac agatgcagac tgtgattgaa gaagtccttg aggaaagtgg 3480 
ctattacaat tttacatcta acaggtatca ttcctatcca tgggggacca agaatcaccc 3540 
aaccaaaaga tgaaaatgct gcattttgag tggacttgat tttctcagtg aagttcaagt 3600 
tctggacttc agccgctatt gcaagatgcc caaggattgg gtgctgctag agggtgtgga 3660 
aaagaccaag atgccatggg gcctgcagga cttctttctg ggggtcctgt gctggagtat 3720 
atgacagctg cggtacttga gggcttcatt gccagaacac attatataca ggatgtcaga 3780 
gctaccagtg tgctgctggg agaaaatgct gcaaaattca tcttttggag ggtgggggga 3840 
aaacccaaaa acaacaacaa aaaaactctc ttacagaatt ttccttaaca ttaaaaaaaa 3900 
cttgtcatat tt 3912

<210> 7 

<211> 3229

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No: 7505222CB1

<400> 7 

gtagcataaa gaggaaatag aaggaaagca ataagtaaga aagtacatat ttcaatctga 60 
aaatgcttgg cactactacc cttggaaaat gtagagaagt agccagtagc cgcgcctggg 120 
gagtcgcctg aacgtgacgg cagcaaatgc agattgttgg gtctccggga ccaggagcag 180 
cgtggccagt gaagcgcgtg gttttcccaa atggtgaaca attcttgtta tctgtggcca 240 
caaagaaagt tatttgtctc tgtcttggca aggctgggag gaaagtttta gctaagaaac 300 
tcagcccatt ggagaccatg gataagtacg atgtgattaa ggccatcggg caaggtgcct 360 
tcgggaaagc atacttagct aaagggaaat cagatagcaa gcactgtgtc ataaaagaga 420 
tcaattttga aaagatgccc atacaagaaa aagaagcttc aaagaaagaa gtgattcttc 480 
tggaaaagat gaaacatccc aacattgtag ccttcttcaa ttcatttcaa gagaatggca 540 
ggctgtttat tgtaatggaa tattgtgatg gaggggatct catgaaaagg atcaatagac 600 
aacggggtgt gttatttagt gaagatcaga tcctcggttg gtttgtacag atttctctag 660 
gactaaaaca tattcatgac aggaagatat tacacaggga cataaaagct cagaacattt 720 
ttcttagcaa gaacggaatg gtggcaaagc ttggggactt tggtatagca agagtcctga 780
ataattccat ggaacttgct cgaacttgta ttggaacacc ttactacctg tccccagaga 840 
tctgtcagaa taaaccctac aacaataaaa cggatatttg gtctcttggc tgtgtcttat 900 
atgagctctg cacacttaaa catccttttg agggtaacaa cttacagcag ctggttctga 960 
agatttgtca agcacatttt gccccaatat ctccggggtt ttctcgtgag ctccattcct 1020 
tgatatctca gctctttcaa gtatctcctc gagaccgacc atccataaat tccattttga 1080
aaaggccctt tttagagaat cttattccca aatatttgac tcctgaggtc attcaggaag 1140
aattcagtca catgcttata tgcagagcag gagcgccagc ttctcgacat gctgggaagg 1200
tggtccagaa gtgtaaaata caaaaagtga gattccaggg aaagtgccca ccaagatcaa 1260
ggatatctgt gccaattaaa aggaatgcta tattgcatag aaatgaatgg agaccaccag 1320
ctggagccca gaaggccaga tctataaaaa tgatagaaag acccaaaatt gctgctgtct 1380 
gtggacatta tgattattat tatgctcaac ttgatatgct gaggaggaga gcccacaaac 1440 
<?Aagtt.at<?^ ccctattcct caagaaaata ctggagttga ggattacggt caggaaacga 1500 
ggcatggtcc atccccaagt caatggcctg ctgagtacct tcagagaaaa tttgaagctc 1560
aacaatataa gttgaaagtg gagaagcaat tgggtcttcg tccatcttct gccgagccaa 1620 
attacaacca gagacaagag ctaagaagta atggagaaga gcctagattc caggagctgc 1680 
catttaggaa aaacgaaatg aaggaacagg aatattggaa gcagttagag gaaatacgcc 1740
aacagtacca caatgacatg aaagaaatta gaaagaagat ggggagagaa ccagaggaga 1800 
actcaaaaat aagtcataaa acctatttgg tgaagaagag taacctgcct gtccatcaag 1860 
atgcatctga gggagaagca cctgtgcagg acattgaaaa agacttgaaa caaatgaggc 1920 
ttcagaacac aaaggaaagt aaaaatccag aacagaaata taaagctaag ggggtaaaat 1980 
ttgaaattaa tttagacaaa tgtatttctg atgaaaacat cctccaagag gaagaggcaa 2040
tggatatacc aaatgaaact ttgacctttg aggatggcat gaagtttaag gaatatgaat 2100 
gtgtaaagga gcatggagat tatacagaca aagcatttga aaaacttcac tgcccagaag 2160 
cagggttttc cacgcagact gtagctgctg tgggaaacag gaggcagtgg gatggaggag 2220 
cgcctcagac tctgctgcag atgatggcag tggccgacat cacctccacc tgccccacgg 2280 
ggcctgacaa tggccaagtt attgtgattg aaggcattcc aggaaacagg aaacagtggc 2340 
ggcatgaagc tccaggaact ttaatgagtg ttttggcagc agcacatcta acgagtagct 2400 
cattttctgc cgatgaagaa tttgcaatgg gaacattaaa acaatggcta cccaaagaag 2460 
aagatgaagg gaaggtagaa atggtctctg gcattgaagt agatgaggaa caactagaac 2520 
caagatctga tgatgatgat acaaattttg aagaatctga agatgagttg agagatgaag 2580 
tagtagaata cttagaaaaa ctcgctactt tcaaagggga agaaaaaaca gaagaggcct 2640 
ccagtacctc taaggactct agaaagtcaa gagaaagaga ggggataagt atgcagaaat 2700 
ctgaagaatt aagggagggc ttggagaata tttctactac atctaatgac cacatttgta 2760 
ttactgatga agaccaagga acatcaacaa ccagtcaaaa tatacaagtg tgattattgt 2820 
actttttctt aagtaataag ttagtgtcta ttacctatag tatttatttg ggtacaagtc 2880 
ataaatgctc atttactgta agggttttct agtaatctca aggatttatt aatttttctt 2940 
tcaatttagg aagtagaact tttgaatata gccattaata tttttacttt aaagtttcta 3000 
ttaagaaatc ttaggccggg cagtctcatc actttgggag gccaaggcag gcagatcatg 3060 
aggtcaggag ttgagaccag tccaaccaac atggtgaaac cccgtctcta ctaaaaatac 3120 
aaaattagct gggcatggtg gtgcatgcct gtaatcccag ttacttggga ggctgaggca 3180 
ggagaatcac ctgaacccag gagatggaag ttgcaagtga gccgagatg 3229

<210> 8 

<211> 2100 

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7524408CB1

<400> 8 

tagagcattg ccttgttgct gacctttcag tatggggagg attggcatct cctgtctttt 60 
tcctgcttct tggcatttta gcatatctcc agtagggtgt cctcgaattc tgaataccaa 120 
tttacgccaa attatggtca ttagtgtcct ggctgctgct gtttcacttt tatatttttc 180 
tgttgtcata atccgaaata agtatgggcg actaaccaga gacaagaaat ttcaaaggta 240 
cctggcacga gttaccgaca ttgaagctac agacaccaat aaccccaatg tgaactatgg 300 
gatcgtggtg gactgtggta gcagtgggtc tcgagtattt gtttactgct gfgccaaggca 360 
taatggcaat ccacatgatc tgttggatat caggcaaatg agggataaaa accgaaagcc 420 
agtggtcatg aagataaaac cgggcatttc agaatttgct acctctccag agaaagtcag 480 
tgattacatt tctccacttt tgaactttgc tgcagagcat gtgccacggg caaaacacaa 540 
agagacacct ctctacattc tctgcacggc tggaatgaga atcctccctg aaagccagca 600
gaaagctatt ctggaagacc ttctgaccga tatccccgtg cactttgact ttctgttttc 660 
tgactctcat gcagaagtaa tttctgggaa acaagaags[t gtgtatgctt .ggattggcat. .720-, - 
- ta&ttttgtc cttggacgat ttgagcatat tgaagatgat gatgaggccg ttgtggaagt 780 
taacattcct ggaagtgtaa gcagcgaagc cattgtccgt aaaaggacag cgggcattct 840 
cgacatgggq ggcgtgttga ctcagatagc gtacgaagtc cccaaaactg caagctttgc 900 
gtcctcacag caggaagaag tagctaaaaa cttgttagct gaatttaact tgggatgtga 960 
tgttcaccaa actgagcatg tgtatcgagt ctatgtggcc acgttttttg ggtttggtgg 1020 
caatgctgct cgacagagat acgaagacag aatatttgcc aacaccattc aaaagaacag 1080 
gctc^tgggt; aaacagactg..gtctgactcc tgatatgccg tacttggacc cctgcctacc 1140 
cctagacatt aaagatgaaa tccagcaaaa tggacaaacc atatacctai gagggactgg 1200 
agactttgac ctgtgtcgag agactatcca gcctttcatg aataaaacaa acgagaccca 1260 
gacttccctc aatggggtct accagccccc aattcacttc cagaacagtg aattctatgg 1320 
cttctccgaa ttctactact gcaccgagga tgtgttacga atggggggag actacaatgc 1380 
tgctaaattt actaaagctg caaaggatta tt^tgcaaca aagtggtcca ttttgcggga 1440 
acgctttgac cgaggactgt acgcctctca tgctgacctc cacaggctta agtgaactgc 1500 
tccatgttgt gggaccaggg atgtgaagca agtacatcaa ccttgaaacg cgcgcagttg 1560 
gtttgggaga gccggcctga ggaagcccct gccccaaggc tgcccacaga gggaaggatt 1620 
gtgtgtgtgt gtgtgtgtgt gccatcttga cagagtgagt cagtctgact tgcctttgtg 1680
cttgccgttt gtaggtatca gtgcttcaaa tcggcctgga tgtttgaggt gtttcatagg 1740 
ggcttttcgt ttcctgtcaa ctataaaagc ttaaagactg ccttgcaagt ttacgacaag 1800
gaggttcage ggacccttgg agccatcctic tacaggaccc gctttctacc attaagagac 1860 
atccagcagg aggccttccg agccagtcac acccactggc ggggcgtttc ctttgtctac 1920 
aaccactacc tgttctctgg ctgcttcctg gtggtgctgc tggccatcct gctgtacctg 1980 
ctgcggctgc ggcgcatcca caggcgcact ccccggagca gctcggccgc cgccctctgg 2040 
atggaggagg gccttcccgc ccagaatgcc ccggggacct tgtgatccag ctcacagcta 2100 

<210> 9 

<211> 4213 

<212> DNA

<213> Homo sapiens

<220>

<221> misc_feature

<223> Incyte ID No: 7526163CB1 

<400> 9 

cccgggccat ggacgagtcg agcctcctgc ggcgccgcgg gctccagaag gagctgagcc 60 
tgccacgccg aggacgtggc tgccgcagcg ggaaccgcaa gagcttggtg gtaggaacgc 120 
cctccccgac cctctcccgg cccctgtcgc cattgtcggt cccaacggca ggcagcagcc 180 
ccttggatag tcctcggaat ttctcggctg cctctgccct aaatttcccc tttgcccgga 240
gggcagacgg cagaagatgg tccctcgcgt ctctcccatc ttccggctat ggaaccaaca 300
cacccagctc caccctctcg tcaagctcat cctcccggga acgtctccac cagcttccct 360 
tccagccgac gccggacgag ctgcacttcc tgtccaagca cttccgcagc tcagagaatg 420 
tgcttgatga ggaaggcggc cggtcacccc gcctccgacc ccgctctcgc agtctcagcc 480 
cgggccgtgc aacggggacc ttcgacaatg agattgtcat gatgaatcac gtgtaccggg 540 
agaggttccc caaggccaca gcacagatgg agggccgtct gcaggagttc ctgacggcct 600 
acgcgcccgg cgcccggctg gcgctggctg atggcgtctt gggcttcatc caccaccaga 660 
tcgtcgagct ggcccgagac tgcttggcca agtctggcga gaacctcgtc acctcccgct 720 
acttcctaga gatgcaggag aagctggagc ggcttctgca ggatgcccat gagcgttcgg 780 
acagtgagga ggtcagcttc atcgtccagc ttgtccggaa actgctgatc atcatctcac 840 
ggccagctcg gctgctggag tgtctggagt ttgaccctga ggaattttac cacctgctgg 900 
aggcggctga gggccatgcg cgggagggcc aaggcattaa gactgacctt ccacagtaca 960 
tcattgggca gctgggcctg gccaaggacc ccctggagga gatggtgcca ctgagtcacc 1020 
tcgaagaaga acagccccca gcacctgagt ccccagagag ccgcgccctg gtcggccagt 1080 
cacggaggaa gccatgcgaa agcgactttg agaccatcaa actcattagc aacggagcct 1140 
atggggccgt ctacctggtg cggcaccgtg acacacggca gcgctttgcc atcaagaaga 1200 
tcaacaaaca gaacttgatc ctgcgtaacc aggtccagca ggtctttgtg gagcgtgaca 1260 
ttctcacctt tgccgagaac ccctttgtgg tcagcatgtt ctgctccttt gagacccggc 1320 
gccacctatg tatggtcatg gaatacgtgg aaggcggcga ctgcgccacg ctcctgaaga 1380 
acatgggccc gctgcccgtg gacatggccc gcctgtactt cgccgagacg gtgttggcgc 1440 
tggagtacct gcataactat ggcatcgtgc accgtgacct caaaccagac aatctgctca 1500
tcacctcgct tggccacatc aagctcacgg acttcggcct gtccaagatc ggcctcatga 1560 
gcatggccac caacctctat gagggccaca tcgagaagga cgcccgagag ttcatcgaca 1620 
agcaggtgtg tgggacgccg gagtacatag cccccgaggt gatcttccgc cagggctatg 1680 
ggaagccagt ggactggtgg gccatgggcg tcgtcctcta tgagtttctg gtgggctgcg 1740 
tgcctttctt tggagatacc cccgaggaac tcttcggtca ggtggtcagc gatgagatca 1800 
tgtggccaga gggagatgag gcccttccag cagacgccca ggacctcatc accaggttgc 1860 
tccggcagag cccgctggac cgtctgggca ctggtggcac ccacgaagtg aagcagcacc 1920
cctttttcct ggccctggac tgggcagggc ttctccgaca caaagccgag ttcgtgcccc 1980
agctcgaagc cgaggatgat accagctact ttgacacacg ttcggaacgt taccgccatc 2040 
tgggctccga ggacgacgag accaatgatg aagaatcgtc cacagagatc ccccagttct 2100 
cctcctgctc ccaccggttc agcaaggtct acagcagctc tgagttcctg gccgtccagc 2160
ccactcctac cttcgctgaa aggagcttca gtgaagaccg ggaggagggg tgggagcgca 2220 
gcgaagtgga ctatggccgc cggctgagtg ctgacatccg gctgaggtcc tggacatcct 2280 
ctggatcctc ctgtcagtca tcttcgtccc agcccgagcg gggtcccagc ccatctctcc 2340 
tgaataccat cagcctggac acaatgccca agtttgcctt ctcatcagag gatgaggggg 2400 
taggcccagg ccctgcaggc cccaagaggc ccgtcttcat tctaggggag cctgaccccc 2460
caccagcggc caccccagtg atgcccaagc cctcgagcct ttctgccgac acagctgctc 2520 
tcagccacgc ccgcctacgg agcaatagca tcggcgcccg acactccaca ccaaggcctc 2580 
tggatgccgg ccggggccgc cgccttgggg gcccaagaga cccagcccct gagaagtcca 2640 
gagcctcctc cagcggtggc agtggtggcg gcagtggggg ccgcgtgccc aagtcagcct 2700
ctgtctctgc cctgtccctc atcatcacgg cagatgatgg cagcggcggc cccctcatga 2760 
gccccctttc cccgcgctct ctgtcctcga acccgtcgtc ccgtgactct tcgccgagcc 2820 
gagacccgtc ccccgtgtgt ggcagcctgc ggccccccat cgttatccac agctctggca 2880 
agaagtacgg cttcagcctg cgggcgatcc gcgtctacat gggtgatagc gacgtctaca 2940 
ctgtgcacca cgtcgtctgg agtgtggagg acggaagccc cgcccaggag gcgggcctgc 3000 
gggctgggga cctcatcacc cacatcaacg gggagtcagt gctggggctg gtgcacatgg 3060 
acgtcgtgga gctgctgctg aagagcggca acaagatatc cctgcggacc acagccctgg 3120 
agaacacctc catcaaggtg ggccccgccc ggaagaatgt ggccaagggc cgcatggcac 3180 
gcaggagcaa gaggagccgt cggcgggaga cccaggatcg gcggaagtca cttttcaaga 3240
agatctccaa gcagacctcc gtgctgcaca ccagccgcag cttctcctcc ggactccacc 3300 
actcactgtc atccagtgag agcctccccg gctcgcccac ccacagcctc tcccccagcc 3360 
ccaccactcc ctgccgaagc ccagcccctg atgtcccagc agataccact gcatccccac 3420 
ccagcgcatc cccgagctcc agcagccccg cctccccagc tgctgctggc cacacccgcc 3480
ccagctccct gcacggcctg gctgccaagc ttgggccacc ccgccccaag actggccgcc 3540
gcaagtccac cagcagcatc ccgccctccc cgctggcctg cccgcccatc tccgcgcccc 3600 
caccccgctc gccctcgccc ctgcccgggc acccgcccgc acctgcccga tccccgcggc 3660 
tgcgccgggg ccagtcagct gacaagctgg gcacagggga gcggctggat ggggaggcgg 3720 
ggcggcgcac tcgtgggcca gaggccgagc tcgtggtcat gcggcggctg cacctgtccg 3780 
agcgccgaga ctccttcaag aagcaggagg ccgtgcagga ggttagcttc gatgagccgc 3840
aggaggaggc cactgggctg cccacctcag tgccacagat cgccgtggag ggcgaggaag 3900 
ccgtgccagt agctctcggg cccaccggaa gagactgatc ccctgccagg tctctccctg 3960 
gcatcaaagt tacgcgtttt cttgtgcaat gttttttccg taaagtcatg cctggatggg 4020 
gactgagcca ccagcctgac acccagaagg cgagaagcca tctcggtcct tgctggaagg 4080 
tggagacatc gcttgtgttc tggtgtcaat cagggggctg gatggggcaa gaatggggga 4140
caagggtggc tttgtaaata gcagcaaatc cctgcaacta atttattact ttttcttttt 4200 
tttttttttt ttt 4213 

<210> 10 

<211> 5991

<212> DNA

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7526158CB1 

<400> 10 

gtaccccgcc gcacagaggc ggcctctgcc tcggcatgaa gtcccgcagg gacaagctgc 60
acatcccggc gctgaccctc gatctgtctc cgagcagcca gagcccgtcc ctgctgggtc 120
ccagcagccc ctgcagcccc tgtagcccct ccttgggcct gcacccctgg agctgccgca 180
gcgggaaccg caagagcttg gtggtaggaa cgccctcccc gaccctctcc cggcccctgt 240 
cgccattgtc ggtcccaacg gcaggcagca gccccttgga tagtcctcgg aatttctcgg 300 
ctgcctctgc cctaaatttc ccctttgccc ggagggcaga cggcagaaga tggtccctcg 360 
cgtctctccc atcttccggc tatggaacca acacacccag ctccaccctc tcgtcaagct 420 
catcctctcg ggaacgtctc caccagcttc ccttccagcc gacgccggac gagctgcact 480 
tcctgtccaa gcacttccgc agctcagaga atgtgcttga tgaggaaggc ggccggtcac 540
cccgcctccg accccgctct cgcagtctca gcccgggccg tgcaacgggg accttcgaca 600
atgagattgt catgatgaat catgtgtacc gggagaggtt ccccaaggcc acagcacaga 660 
tggagggccg tctgcaggag ttcctgacgg cctacgcgcc cggcgcccgg ctggcgctgg 720
ctgatggcgt cttgggcttc atccaccacc agatcgtcga gctggcccga gactgcttgg 780
ccaagtctgg cgagaacctc gtcacctccc gctacttcct agagatgcag gagaagctgg 840 
agcggcttct gcaggatgcc catgagcgtt cggacagtga ggaggtcagc ttcatcgtcc 900 
agcttgtccg gaaactgctg atcatcatct cacggccagc tcggctgctg gagtgtctgg 960 
agtttgaccc tgaggaattt taccacctgc tggaggcggc tgagggccat gcgcgggagg 1020 
gccaaggcat taagactgac cttccacagt acatcattgg gcagctgggc ctggccaagg 1080
accccctgga ggagatggtg ccactgagtc acctcgaaga agaacagccc ccagcacctg 1140 
agtccccaga gagccgcgcc ctggtcggcc agtcacggag gaagccatgc gaaagcgact 1200 
ttgagaccat caaactcatt agcaacggag cctatggggc cgtctacctg gtgcggcacc 1260 
gtgacacacg gcagcgcttt gccatcaaga agatcaacaa acagaacttg atcctgcgta 1320 
accagatcca gcaggtcttt gtggagcgtg acattctcac ctttgccgag aacccctttg 1380 
tggtcagcat gttctgctcc tttgagaccc ggcgccacct atgtatggtc atggaatacg 1440 
tggaaggcgg cgactgcgcc acgctcctga agaacatggg cccgctgccc gtggacatgg 1500 
cccgcctgta cttcgccgag acggtgttgg cgctggagta cctgcataac tatggcatcg 1560 
tgcaccgtga cctcaaacca gacaatctgc tcatcacctc gcttggccac atcaagctca 1620 
cggacttcgg cctgtccaag atcggcctca tgagcatggc caccaacctc tatgagggcc 1680 
acatcgagaa ggacgcccga gagttcatcg acaagcaggt gtgtgggacg ccggagtaca 1740 
tagcccccga ggtgatcttc cgccagggct atgggaagcc agtggactgg tgggccatgg 1800 
gcgtcgtcct ctatgagttt ctggtgggct gcgtgccttt ctttggagat acccccgagg 1860
aactcttcgg tcaggtggtc agcgatgaga tcatgtggcc agagggagat gaggcccttc 1920 
cagcagacgc ccaggacctc atcaccaggt tgctccggca gagcccgctg gaccgtctgg 1980 
gcactggtgg cacccacgaa gtgaagcagc accccttttt cctggccctg gactgggcag 2040 
ggcttctccg acacaaagcc gagttcgtgc cccagctcga agccgaggat gataccagct 2100 
actttgacac acgttcggaa cgttaccgcc atctgggctc cgaggacgac gagaccaatg 2160 
atgaagaatc gtccacagag atcccccagt tctcctcctg ctcccaccgg ttcagcaagg 2220
tctacagcag ctctgagttc ctggccgtcc agcccactcc taccttcgct gaaaggagct 2280 
tcagtgaaga ccgggaggag gggtgggagc gcagcgaagt ggactatggc cgccggctga 2340 
gtgctgacat ccggctgagg tcctggacat cctctggatc ctcctgtcag tcatcttcgt 2400 
cccagcccga gcggggtccc agcccatctc tcctgaatac catcagcctg gacacaatgc 2460 
ccaagtttgc cttctcatca gaggatgagg gggtaggccc aggccctgca ggccccaaga 2520 
ggcccgtctt cattctaggg gagcctgacc ccccaccagc ggccacccca gtgatgccca 2580 
agccctcgag cctttctgcc gacacagctg ctctcagcca cgcccgccta cggagcaata 2640 
gcatcggcgc ccgacactcc acaccaaggc ctctggatgc cggccggggc cgccgccttg 2700 
ggggcccaag agacccagcc cctgagaagt ccagagcctc ctccagcggt ggcagtggtg 2760 
gcggcagtgg gggccgcgtg cccaagtcag cctctgtctc tgccctgtcc ctcatcatca 2820 
cggcagatga tggcagcggc ggccccctca tgagccccct ttccccgcgc tctctgtcct 2880 
cgaacccgtc gtcccgtgac tcttcgccga gccgagaccc gtcccccgtg tgtggcagcc 2940 
tgcggccccc catcgttatc cacagctctg gcaagaagta cggcttcagc ctgcgggcga 3000 
tccgcgtcta catgggtgat agggacgtct acactgtgca ccacgtcgtc tggagtgtgg 3060 
aggacggaag ccccgcccag gaggcgggcc tgcgggctgg ggacctcatc acccacatca 3120 
acggggagtc agtgctgggg ctggtgcaca tggacgtcgt ggagctgctg ctgaagagcg 3180 
gcaacaagat atccctgcgg accacagccc tggagaacac ctccatcaag gtgggccccg 3240 
cccggaagaa tgtggccaag ggccgcatgg cacgcaggag caagaggagc cgtcggcggg 3300 
agacccagga tcggcggaag tcacttttca agaagatctc caagcagacc tccgtgctgc 3360 
acaccagccg cagcttctcc tccggactcc accactcact gtcatccagt gagagcctcc 3420
ccggctcgcc cacccacagc ctctccccca gccccaccac tccctgccga agcccagccc 3480 
ctgatgtccc agcagatacc actgcatccc cacccagcgc atccccgagc tccagcagcc 3540 
ccgcctcccc agctgctgct ggccacaccc gccccagctc cctgcacggc ctggctgcca 3600 
agcttgggcc accccgcccc aagactggcc gccgcaagtc caccagcagc atcccgccct 3660
ccccgctggc ctgcccgccc atctccgcgc ccccaccccg ctcgccctcg cccctgcccg 3720
ggcacccgcc cgcacctgcc cgatccccgc ggctgcgccg gggccagtca gctgacaagc 3780 
tgggcacagg ggagcggctg gatggggagg cggggcggcg cactcgtggg ccagaggccg 3840
agctcgtggt catgcggcgg ctgcacctgt ccgagcgccg agactccttc aagaagcagg 3900 
aggccgtgca ggaggttagc ttcgatgagc cgcaggagga ggccactggg ctgcccacct 3960 
cagtgccaca gatcgccgtg gagggcgagg aagccgtgcc agtagctctc gggcccaccg 4020 
gaagagactg atcccctgcc aggtctctcc ctggcatcaa agttacgcgt tttcttgtgc 4080
aatgtttttt ccgtaaagtc atgcctggat ggggactgag ccaccagcct gacacccaga 4140 
aggcgagaag ccatctcggt ccttgctgga aggtggagac atcgcttgtg ttctggtgtc 4200 
aatcaggggg ctggatgggg caagaatggg ggacaagggt ggctttgtaa atagcagcaa 4260 
atccctgcaa ctaatttatt actttttttt tttttttttt tttttttttt tgagacagag 4320
tctcactctg ttgcccgggc tggagtgcag cggcgtgatc tcagctcact gcaacctccg 4380 
cctcccaagt tcaagcgatt gtcctgcctc agcttcccaa gtggctggga ttacaggcgc 4440 
ccaccactat gcccagctaa ttttttgtat ttttagtaca gacggggttt caccatgttg 4500 
gtcaggctgg tctcgaactc ctgacctcat gatttgcctg cctttgcctc ccaaagtgct 4560 
gggattacag gcgtgagcca ctgggcccag cctaatttat tactttttat aagcgatagc 4620
cgtactgagc cgccccctga aggcggctgc caggtcttgc cccaggcacc tgggactctg 4680 
tttgcaggcc ctgccctctg ggctgagaag gatgcacttt ggacaagtca tctgtgtttg 4740 
tgttttccag tttttctgta ctttttaagt gttttgtgtt acctggtctc attpccctcc 4800 
ccacacctac ccatttgagg ggatggagtt gaagtcacct ggtcacctgt accggcccag 4860
ttcggctaca acctggagtg tccgtaaaca attcctctca cccacaaaac aatgtaatcc 4920 
cagcgatgga ctggattctg aaggccactt cccaccatca tagctgccat gcccaggcag 4980 
tgcctgctct atatatagag tctgcctcca atcctgctgg cttcagcctg gagaagggat 5040 
atgggagctg gagctttgat ggatgaatag gtgttcaccg gatctgggca gaggggtcat 5100 
ccgctcccca ggtgggcact gataaaggaa ggtacaggcc tcacctggaa ctgccaaggc 5160 
agcctccaga aatgctcggc tgtctcgggg cacgctccag tatgccagtc ctgcgggatt 5220
acgtccagct acttccagaa acactcagtg tcccctcccc tcaggctctg ccttggcctg 5280 
gccttgtcca gtctaccctg gacaagatgc cgtgtgtttg aggcccagca gagtaagccc 5340 
ttggccgtga tgtgtctgaa acacctgtta ggggttccct ccatatgtca gagcctctct 5400 
gggatgaagt tcaagccaga aaacccagtc gaggctcaag tttgaatttc agcttcactg 5460 
tgtggctctg ggaaaatggc tttcccactc tgtgcctcag tttccttgtg tttacaagac 5520 
taatcccatt gactgtttat taagcaccta ctgtgtgcca agcgctttta cgtggcttct 5580 
ccctcagcca gccttgagaa ggctggaggt ggtgtcatca cctccatttt acagacaaag 5640 
cagctgagac cccagcgagg ggcggagacc tgtcccacga tcacccagca ggagtcgtgg 5700 
cagaacggag catcagccag accctgttgt gggcgttgtc atcaagggag cttgaatgga 5760 
gggtctggtg tcagatacag ccgactccag ccccagctca tcccccatga tgctgtgtga 5820
cccac^igggc actctggtga gggagctttc cagacatcaa cagcccactc tgcttccctt 5880 
tctgagtccc ctgtccagca ctgcctagtg ttggagggta gaccaaggct gtgcatgatt 5940 
caccccctcc ttccatcctg gagctggcag tgaataaaag cccgtattta c 5991 
Incyte Polypeptide ID
2509577CD1
7505222CD1
7524408CD1
7526163CD1
7526158CD1
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17505222 
Table 4



SEO ID NO'J 
Incyte ED/ 
Seauence T^nDth 


Selected 
rTagmeiiCS 


oequence Jrraginents 


5* Position 


3' Position 




JoW-jyx2, 1- 


GBIJ4EK10.edit 


1 


3912 






56087809H1 


692 


1142 






56087825H1 


692 


1191 






56087901H1 


692 


1374 







56087901JI 


692 


1374 






56087825 J I 


692 


1383 






56087925H1 


692 


1385 






5608780911 


692 


1385 






56087925J1 


692 


1385 






50266 15F6 (COLCDITOl) 


991 


1569 






55065621H1 


1053 


1112 






2509577H1 (CONUTUTOl) 


1118 


1449 







72335521V 1 


1118 


1619 






71869970V1 


1118 


1704 






72335487V1 


1118 


1717 






71871205V1 


1118 


1774 






72648030V1 


1193 


1462 






72645481V1 


1193 


1505 






5507725 IHl 


1193 


1625 






55077255H1 


1193 


1667 






72647270V1 


1193 


1687 






72646262V1 


1193 


1702 






72646441 VI 


1193 


1716 






55077256H1 


1193 


1724 






72646443V1 


1193 


1735 






72334887V1- 


1202 


1884 






g229 18955 


1306 


1624 . 






9653267U2 


1308 


1841 






73282635V1 


1308 


1866 






96575 11U2 


1308 


1989 






6097133F8 (UTRENOT09) 


1308 


2001 






965751 lUl ' 


1308 


2006 






73282635D1 


1323 


2016 






6097133F6 (irreENaro9) 


1333 


1 H'X'X 1 
AOJJ 1 






55056758H1 


1831 


2300 






55056782H1 


1831 


2302 






55056710H1 


1831 : 


2302 






55056742H1 


1831 : 


2302 






55056718H1 


1831 : 


2302 1 






55056790HI 


1831 : 


2302 






55056766H1 j 


1831 : 


2302 






)5056750H1 J 


1831 : 


2302 




< 


>5056726H1 j 


L831 : 


(302 




4 


)5056734H1 J 


831 ^ 


(302 1 



1 



PF-1688P 



Table 4 



|xuiynuwieoiiae 

locquencc JLengtn 
1 ~ 


Selected 
Fragments 


Sequence Fragments 


5* Position 


3* Position 


j 




73281314V1 


1838 


2524 


j 




96351 18U1 


1911 


2622 


j 




9635118U3 


1921 


2796 







9668815U2 


2158 


2951 







73282073V1 


2231 


2430 






gl3996607 


2266 


2685 







9668815U1 


2269 


2951 






73282214V1 


2359 


2951 


1 — 


— 


5546336H1 (TESTNOCOl) 


2441 


2632 


j — : 




5546336F8 (TESTNOCOl) 


2441 


2957 


1 




5546336F7 (TESTNOCOl) 


2441 


2965 


j : 




6999077H1 (BRAXTDR17) 


2466 


3026 







6999077F8 (BRAXTDR17) 


2467 


2943 


— 




6999077R8 (BRAXTDR17) 


2475 


3193 


J — :— 




4123469H1 (BRSTTUT25) 


2676 


2925 






239207 IHl (PROSNONOl) 


2969 


3225 


1 // fJUDZZZK^o U 
j _ 


1848-2289, 1- 
30, 1097-1517, 
228-446 


GBI.g22053291.edit4 


1 


2813 


1 ' ' ' 




GBI.g22053291.edit9 


147 


2813 


j — 




7469170H1 (LUNGNOE02) 


1800 


2249 


j — 




6322794H1 (LUNGDIN02) 


1976 


2224 


j 




6322715H1 (LUNGDIN02) 


1976 


2224 


1 




6322794F6 (LUNGDIN02) 


1976 


2590 


1 — : 




6322794T6 (LUNGDINQ2> 


2139 


2331 


1 — : 




72551805D1 


2290 


2373 


j 




72549212DI 


2290 


2484 






2990137H1 (KIDNFET02) 


2290 


2512 


j — 1- 




72614971V1 


2290 


2573 


J — 




71684926V1 


2290 


2587 


1 




2990137F6 (KIDNFET02) 


2290 


2592 


j 




2775895F6 (PANCN0T15) 


2290 


2596 


1— 




g2046095 


2290. 


2601 - — - 


j 




W(S7485U1 


2290 


2626 






5533976U1 


2290 


2636 


1 




72551755D1 : 


2290 


2641 






71529926V1 : 


2290 


2669 






71530077V1 : 


1290 : 


2673 






72612548V1 : 


2290 : 


2702 






7248203 IDl • : 


i290 : 


2732 






72617183V1 : 


1290 : 


1755 






^2548079D1 : 


mo : 


1160 






322971 8D1 : 


1290 : 


1110 






322971 8 VI : 


Q90 : 


nio 




1 


'2483936D1 |2290 2 


1176 






^2613936V1 I229O 2 


11S3 



2 
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Table 4 



Polynucleotide 
SEQ ID NO:/ 
Incyte ID/ 
Sequence Length 


Selected 
Fragments 


Sequence Fragments 


5* Position 


3* Position 






71530048V1 


2290 


2825 






7 1530248V 1 


2290 


2861 






72483157D1 


2290 


2864 






72551004D1 


2290 


2871 






7i530724Vl 


2290 


2899 






72482548D1 


2290 


2981 






g2051645 


2295 


2804 






2775895T6 (PANCNOT15) 


2315 


2833 


_ 




72482840D1 


2427 


3094 






72483342D1 


2427 


3215 






72548002D1 


2465 


3204 






72551242D1 


2482 


3211 






71914061V1 


2495 


2899 






71530960V1 


2509 


2900 






72550024D1


2536 


3229 






71528089V1 


2538 


2900 






72615074V1


2575 


2812 






2990137T6 (KIDNFET02) 


2596 


2819 






72613394V1 


2649 


3229 


8/7524408CB1/ 
2100 


1620-1641 


95078842J1 


1 


753 






95078918H1 


1 


877 






95078842R6


2 


669 






95078982R6 


2 


676 






95078918F6 


2 


718 






GBI_jnL008130_GDL8^edit 


2 


2099 






9792742U2 


529 


1448 






9792742U1 


548 


1422 






9792741U2 


631 


1515 






9792741U1 


640 


1476 






95078918J1 


1257 


2100 






95078850H1 


1319 


2100 






95078918R6 


1340


2099 






95078842H1 


1345 


2100 






95078850F6 


1353 


2099 


4213 


1 1 A1Q A 1 on 

4213, 3559- 
3648, 2943- 
3018 


55114261H1


1 


718 






55114385H1 


1 


789 






g304364S_.CD 


11 


4017 






6889766H1 (BRAITDR03) 


14 


590 






6951981F8 (BRAITDR02) 


17 


664 






6951981H1 (BRAITDR02)


17 


674 






55114385J1


24 


849 






7751843J1 (HEAONOE01)


79 


221 










GBI.g22053494.edit1


169 


4213 






55114269H1


171 


849 







55114253H1 


241 


929 






55114377J1 


241 


929 






7676193J2 (NOSETUE01)


311 


946 






70505656V1 


380 


828 






6951981R8 (BRAITDR02)


421 


1058 






55114261J1


440 


1069 






6977576F8 (BRAHTDR04) 


466 


1171 






55055940H1 


494 


984 






55055924H1 


494 


1004 






55055980H1


494 


1017 






72634030V1 


494 


1019 






72635945V1


494 


1037 






72636145V1 


494 


1122 






55055972J1 


494 


1270 






55055956J1 


494 


1279 






72633368V1


509 


1065 






70501435V1


541 


1058 







70504122V1 


559 


1058 






72469204D1 


654 


1148 






72466303D1 


674 


1307 






72634469V1


676 


1286 






70505681V1 


722 


874 






72637358V1 


731 


1267 






72633390V1 


750 


1460 







72464094D1 


850


1386 






72635309V1 


940 


1516 






8270466U1 


960 


1765 






72633444V1 


963 


1639 






72465303D1 


965 


1553 






72633860V1


965 


1590 






72632931V1 


973 


1633 






72464155D1 


975 


1571 






72465039D1 


1023 


1616 






72464370D1 


1071 


1647 






55085554H1 


1079 


1569 






72467177D1


1093 


1716 






55085553J1 


1101 


1585 






72466685D1 


1110 


1658






72468522D1 


1110 


1689 






72463771D1 


1110 


1725 






72464836D1 


1117 


1590 






55085554J1 


1120 


1788 






8270465U1 


1127 


1992 






72464654D1


1134


1635









72467808D1


1135 


1581 






72466006D1 


1144 


1648 






72465671D1


1145 


1876 






72464963D1 


1157


1598 






72468460D1 


1161 


1839 






72468357D1 


1162 


1772 






72464181D1 


1196 


1857 






72463830D1 


1204 


1762 






72635760V1 


1207 


1915 






72464363D1 


1215 


1877 






72466089D1 


1262 


1805 






72464123D1


1277 


1775 






72468937D1


1280 


1981 






72633513V1


1290 


1915 






g749672 


1298 


1711 






72469574D1 


1303 


1964 






72466314D1 


1349 


1825 






72634480V1


1354 


1973 






72637375V1


1367 


2079 






72467343D1 


1378 


2014 






72468676D1 


1384 


2081 






72467479D1 


1418 


1946 






55076784H1 


1422


1915 






72467492D1 


1427 


1948 






72464857D1 


1431 


1980 






72467007D1 


1433 


2054 






72463976D1


1440 


2126 






72466325D1 


1460 


2112 






72463745D1 


1488 


1898 






72634757V1 


1533 


2187 






55055948H1 


1544 


2393 






55055924J1 


1559 


2390 






55055940J1 


1570 


2397






55055972H1 


1583


2386 






72466202D1 


1608 


2050 






72466976D1


1612 


2210 






6826871J1 (SINTNOR01)


1639 


2204 






g11974952


1639 


2221 






g14078572


1639 


2291 






g14075791


1639 


2342 






72464236D1 


1641 


2322 






72856783D1 


1666 


2393 






72856783V1 


1676 


2347 






3268480F6 (BRAINOT20) 


1684 


2366 






7972002H1 (CONRTUC01)


1692 


2223 






72857113V1


1701 


2393 









72636173V1


1704 


2231 






72856562D1 


1706 


2320 






72856562V1 


1706 


2393 






55114293J1 


1711 


2531 






g11970048


1724 


2431 






7623318J1 (HEARFEE03) 


1755 


2310 






72836565V1 


1760 


2393 






55148292J1 


1762 


2192 






55148276J1 


1763 


2192 






6767896F7 (BRAUNOR01)


1771 


2398 






72468248D1 


1793 


2259 






72467410D1


1798 


2393 






55055956H1 


1809 


2393 






72839279V1 


1813 


2393 






72468512D1 


1816 


2393 






55148268H1 


1818 


2333 






55148268J1 


1818 


2333 






72464180D1


1820 


2393 






72841950V1


1856 


2393 






72468080D1 


1858 


2393 






72472316D1 


1885 


2393 






55076784J1


1897 


2393 






55076780J1 


1913 


2393 






72633391V1


1915 


2393 






55076785J1 


1916 


2393 






55076786J1 


1936 


2393 






55076787J1 


1937 


2393 






7960593J1 (TONSDIE01)


1980 


2644 






72465305D1 


1993 


2393 






55075185J1 


2005 


2393 






g12287025


2021 


2484 






7710071H1 (PANCNOE02) 


2029 


2419 






55075187J1


2086


2393 






8247429H1 (LIVRUNF02)


2148 


2792 






55148268J1 


2156 


2180 






55148268H1


2156 


2180






73208142D1 


2276 


3024 






73208142V1 


2289 


3020 






9468312U1


2314 


3220 






g7142756


2329


2786






g7142755


2329 


2829 






7582850H1 (BRAIFEC01)


2351 


2940 






7582850F6 (BRAIFEC01)


2363 


3123 






g7142551


2586 


3050 






7623318H1 (HEARFEE03)


2602 


3014 






7960593H1 (TONSDIE01)


2628 


3274 
PF-1688P



73209446D1 



73209614D1 



2696 



73210136D1 



73209446V1 



g3162737



7717974H1 (SINTFEE02) 



2731 



2840 



2863 



2945 



g11975870



6993194H1 (BRAQTDR02) 



8179153H1 (EYERNON01)



73207920D1 



73207920V1 



g7113208 



g7112889 



7219549H1 (SPLNDIC01)



7219549F9 (SPLNDIC01)



1894680H1 (THP1TXT04)



1894680F6 (THP1TXT04)



8166636H1 (MIXDTXE01)



34C^ 
3428 



3595 



3627 



363^ 
3638 



3711 



3720 



3809 



3809 



3909 



3909 



4059 



3341 



3244 



3377 



3679 



3400 



3605 



3935 



4034 



4183 



4061 



4091 



4185 



4189 



4191 
4212 



4160 



4181 



4213 



10/7526158CB1/ 
5991 



1-1731, 2436-4780



55114385H1 



GBI.g22053494.edit1



55114385J1 



20 



GBI.g22053494.edit4



36 



55114285J1



66 



7751843J1 (HEAONOE01)



99 



882 



5991 



942 



335 



942 



314 



56013762J1 



133 



55114353H1 
55114369J1 



175 



214 



335 



942 



942 



55114277H1 



241 



GBI.g17455923.edit1



262 



55114393H1 



333 



55114277H1 



333 



7676193J2 (NOSETUE01)



404 



70505656V1 



7526158F6 (null) 



473 



473 



334 



4306 



941 



942 



1039 



921 



932 



6951981R8 (BRAITDR02) 



514 



1151 



55114261J1



533 



1162 



6977576F8 (BRAHTDR04) 



559 



1264 



55055940H1 



587 



55055924H1 



587 



55055980H1 



587 



1077



1097 



1110 



72634030V1 



587 



1112 



72635945V1 



587 



72636145V1 
55055972J1



587 
587 



1130 



1215 
1363 



5' Position 3' Position



55055956J1 



72633368V1



70501435V1 



70504122V1 



72469204D1 



72464155D1 



55085554H1 



72467177D1 



55085553J1 



55085554J1 



8270465U1 



72465671D1



72468460D1 



72463830D1 



72464123D1 



72468937D1 



g749672 



72466314D1 



72467343D1 



72468676D1 



72467479D1 



55076784H1 
72467492D1 



72464857D1 



72463976D1 



72466325D1 



72463745D1 



72634757V1



55055948H1 



587 



1372 



602 



1158 



634 



1151 



652 



1151 



747 



1241 



1068 



1664 



1172 



1662 



1186 



1809 



1194 



1678 



1213 
1220 



1881 



2085 



1238 



1969 



1254 
1297 



1932 



1855 



1370 
1373 



1868 



2074 



1391 
1442 
1471 



1804 



1918 



2107 



1477 



1511
1515 



2174 
2039 



2008 



1520 



2041 



1524 



2073 



1533 



2219 



1553 



2205 



1581 



1991 



1626 



2280 



1637 



2486 



55055924J1 



55055940J1 



55055972H1 



72466202D1 



72466976D1 



6826871J1 (SINTNOR01)



g11974952



g14078572



g14075791



72464236D1 



72856783D1



72856783V1 



3268480F6 (BRAINOT20)



1652 



2483 



1663 
1676 



2490 



2479 



1701 



2143 



1705 



2303 



1732 



2297



1732 



2314



1732 



2384 



1732 



2435 



1734 



2415 



1759



2486



1769 



2440 



1777 



2459 



7972002H1 (CONRTUC01)



72857113V1



1785 



2316 



1794 



2486 



72636173V1 



1797 



72856562D1 



2324 



1799 



2413



Sequence Fragments 



72856562V1 



55114293J1 



gl 1970048 



7623318J1 (HEARFEE03)



72836565V1 



6767896F7 (BRAUNOR01)



72468248D1 



72467410D1 



55055956H1 
72839279V1 



72468512D1 



72464180D1 



72841950V1 



72468080D1 



72472316D1



55076784J1 



55076780J1 



72633391V1 



55076785J1 



55076786J1 



55076787J1 



7960593J1 (TONSDIE01)



5' Position 



1799 



3' Position 



1804 



1817 



1848 



1853 



1864 



1886 



1891 



1902 



1906 



1909 



1913 



1949 



1951 



1978 



1990 



2006 



2008 



2009 



2029 



2030 



72465305D1 



55075185J1



gl2287025 



7710071H1 (PANCNOE02)



55075187J1 



8247429H1 (LIVRUNF02)



73208142D1 



73208142V1



9468312U1 



2073 



2086 



2098 



2114 



2122 



2179 



2486 



2624 



2524 



2403 



2486 



2491 



2352 



2486 



2486 



2486 



2486 



2486 



2486 



2486 



2486 



2486 



2486 



2486 



2486 
2486 



2486 



2737 



2486 



2486 



2577 



2512 



2241 



2369 



2382 



2407 



2486 



2885 



3117 



3113 



3313 



g7142756 



g7142755



2422 



2879 



2422 



7582850H1 (BRAIFECOl) 



2922



2444 



7582850F6 (BRAIFEC01)



3033 



2456 



g7142551 



3216 



2679 



7623318H1 (HEARFEE03) 



2695 



7960593H1 (TONSDIE01)



2721 



73209446D1 



2789 



3143 
3107 



3367 



3434 



73209614D1 



73210136D1 



2824 



3337 



73209446V1 



2933 



3470 



2956 



g3162737



3772 



3038 



7717974H1 (SINTFEE02) 



3493 



3495 



g11975870



3698 



3521 



4028 



8179153H1 (EYERNON01)



3720 



4276






73207920D1 


3725 


4154 






73207920V1 




4164 






6993194H1 (BRAQTDR02) 


3740 


4127 






g7113208


3804 


4278 






g7112889


3813 


4282 






7219549H1 (SPLNDIC01)


3902 


4289 






7219549F9 (SPLNDIC01)


3902 


4305 






1894680H1 (THP1TXT04)


4002 


4253 






1894680F6 (THP1TXT04)


4002 


4274 






8166636H1 (MIXDTXE01)


4152 


4306 






GBI.g22053494.edit1


4403 


4463 






GBI.g22053494.edit1


4537 


4598 






Representative Library 


TESTNOC01


LUNGDIN02


BRAITDR02


THP1TXT04 


Project ID: 


7CB1 


7505222CB1


ICBl 1 


[58CB1 




250957: 


\o 


Incyt 


7526] 


7526J 


leotide SEQ 










Polynuc 
ID NO: 



Hispanic frequency
Asian frequency










0.67 